Annual Review Introduction

We continue to engage in large multi-person research
projects and in individual research. Two of our
largest projects are the design of hardware and software
for our multi-mini-processor computer system
(C.mmp)  and  the  building  of  the  Hearsay  speech 
understanding  system. 

By June 1973 C.mmp had grown to three processors
and four memory ports. By the end of summer
the 16 x 16 switch will be ready, giving us the
potential  of  a  16-processor  configuration.  The  kernel 
of  the  operating  system  is  running  on  the  prototype 
and  is  being  driven  by  test  programs.  A  piece  of  the 
speech  system  is  now  up  on  the  C.mmp. 

The  Hearsay  system  is  now  operational  and  was 
demonstrated  live  at  several  workshops.  This  system 
demonstrates  the  use  of  context,  syntax,  and  seman¬ 
tics  in  a  speech  recognition  task  and  represents  a 
significant  milestone  in  the  cooperative  speech  under¬ 
standing  research  effort  that  is  presently  underway  at 
several  universities  and  research  institutions. 

We  view  workshops  and  symposia  as  one  of  our 
major  links  for  research  communication  with  the  rest 
of  the  world. 

A  nine  day  Workshop  (joint  with  Psychology) 
explored  New  Techniques  in  Cognitive  Research.  The 
"new  techniques"  are  programming  systems  that  em¬ 
body  within  themselves  significant  psychological 
theory  which  one  explores  and  uses  to  construct  new 
theory interactively. The nine days were spent on-line
and  a  major  purpose  of  the  Workshop  was  to  assess 
the  advantages  of  this  sort  of  scientific  communica¬ 
tion  rather  than  the  usual  talk-intensive  workshops. 
The  Workshop  was  a  success  and  we  are  engaging  in 
seven  smaller  but  similar  workshops  this  summer. 

Other  workshops  dealt  with  Architecture  and  Ap¬ 
plication  of  Digital  Modules  and  with  Segmentation 
and  Classification  of  Connected  Speech. 

A  group  of  IBM  scientists  and  managers  visited  for 
a  two  day  CMU-IBM  Minisymposium  on  current  com¬ 
puter  science  research  at  the  two  institutions.  A 
Symposium  on  Complexity  of  Sequential  and  Parallel 
Numerical  Algorithms  provided  a  forum  for  the  pre¬ 
sentation  and  discussion  of  recent  research  results  and 
surveys  on  topics  such  as  the  interdependence  of 
machine organization and algorithms, and algebraic
and  analytic  computational  complexity. 


Design  Augmentation 

Charles  M.  Eastman 

In  1963,  Steven  Coons  described  the  potential  of  the 
computer  in  design  as  follows: 

"We  outlined...  a  system  that  would  in  effect 
join  man  and  machine  in  an  intimate  cooperative 
complex,  a  combination  that  would  use  the  crea¬ 
tive  and  imaginative  powers  of  the  man  and  the 
analytical  and  computational  powers  of  the  ma¬ 
chine  each  with  the  greatest  possible  economy  and 
efficiency.  We  envisioned  even  then  the  designer 
seated  at  a  console,  drawing  a  sketch  of  his  pro¬ 
posed  device  on  the  screen  of  an  oscilloscope  tube 
with  a  "light  pen",  modifying  his  sketch  at  will, 
and  commanding  the  computer  slave  to  refine  the 
sketch  into  a  perfect  drawing,  to  perform  various 
numerical  analyses  having  to  do  with  structural 
strength,  clearances  of  adjacent  parts,  and  other 
analyses  as  well  .  .  .  ”  ^1  . 

It is now ten years later; a wide variety of graphic
terminals have become available, yet this conception
has been realized in only very limited areas. One can
attribute the failure to realize Coons' image to a
variety of causes. Among them must be included:

a.  poor  understanding  of  most  design  tasks.  In  most 
areas  we  still  lack  a  clear  picture  of  the  informa¬ 
tion  typically  available  for  use  in  decision-making, 
the sequences of decisions required due to external
requirements,  and  the  mode  of  problem  solving 
normally  used  by  designers  in  a  particular  field. 

b. restricted system designs. Little effort has been
devoted to the matching of design tasks to CAD
system capabilities. The generality of data structures,
operations, and forms of analysis has been
limited, at least in part, by technical problems of
software organization.


c. an arbitrarily restricted view of the contribution of
the  computer  to  design.  Most  efforts  in  CAD  have 
begun  by  partitioning  design  into  two  sets  of  tasks, 
those  algorithmically  defined  subtasks,  and  all 
others,  as  Coons  has  done  above.  The  machine 
undertakes  the  first  set  while  the  human  designer 
"fills  in"  to  complete  all  the  others.  This  partition¬ 
ing  is  often  an  inefficient  use  of  both  man  and 
machine  and  inevitably  leads  to  questionable  sys¬ 
tems  organization  assumptions. 

Since about 1967, a group of faculty and students
at  CMU  has  been  addressing  the  above  technical  issues 
associated  with  computer-aided  design,  particularly  as 
applied  to  architecture,  civil  engineering,  and  indus¬ 
trial  equipment  design.  In  this  paper,  I  review  these 
efforts  and  outline  what  I  believe  to  be  their  con¬ 
tribution  to  date. 

Task  Analysis 

Design  is  often  considered  an  art,  particularly  if  it 
is  oriented  towards  buildings  or  other  public  prod¬ 
ucts.  Case  studies  and  more  rigorous  analyses  of  the 
process  of  design,  as  carried  out  traditionally,  have 
only begun to clarify for computer system designers
the  tasks  involved  in  design  and  their  possible  organi¬ 
zations. 

Design  is  a  process  of  long  duration;  a  complex 
design  may  take  several  years  to  complete.  The  range 
of  activity  involved  in  such  an  extensive  process  re¬ 
quires  a  carefully  structured  analysis.  Analyses  of  the 
design  process  have  been  attempted  at  three  levels  of 
detail.  The  first  and  most  general  level  might  be  called 
the  molar  level.  Its  duration  is  the  total  length  of 
design (months or years), and the decisions it examines
are usually collected through a case study (i.e.,
recall) or diary format; the actions characterized are
most  often  those  of  a  group.  The  kinds  of  informa¬ 
tion  normally  collected  at  the  molar  level  include  the 
general  sequence  in  which  major  design  decisions  are 
made,  what  those  decisions  are,  the  sequence  in 
which  important  information  is  received,  and  from 
whom. 

It  is  also  possible  to  analyze  subsets  of  design 
decisions.  Design  problems  can  be  defined  that  are 
the  appropriate  province  of  a  single  decision-maker. 
In this research context, the information brought to
bear, the external representation of information, and
other  processing  of  information  by  an  individual  de¬ 
signer  can  be  carefully  monitored.  This  level  of  detail 
might  be  called  the  molecular  level  of  design  analysis. 
Usually  this  level  of  analysis  is  characterized  as  design 
"problem-solving"  and  is  amenable  to  the  techniques 
of  analysis  Newell  has  developed  for  problem  solving 
research [15]. Design actions at the molecular level
assume  as  primitives  both  standard  methods  of  analy¬ 
sis,  where  these  are  formalized,  and  cognitive  proc¬ 
esses,  where  no  formalization  exists.  Each  design 
problem  analyzed  at  this  level  also  is  assumed  to  be  a 
single  element  in  the  analysis  of  case  studies.  Thus 
this  intermediate  level  of  analysis  relates  primitive 
perceptual  and  cognitive  processes  to  the  global  or¬ 
ganization  of  design. 

The  lowest  level  at  which  design  has  been  studied 
may  be  called  the  atomic  level  (in  accordance  with 
our  physical  science  analogy).  At  this  level,  the  re¬ 
searcher  is  concerned  with  the  primitive  capabilities 
required  to  analyze  and  synthesize  physical  systems. 
Many  of  the  analytic  primitives  have  received  much 
attention,  e.g.,  procedures  for  predicting  the  behavior 
of a structure under static loads. Others are of a psychological
nature and have only begun to be explored.
These  include: 

a.  the  structure  of  human  memory  about  the  physi¬ 
cal  and  visual  world  and  strategies  for  accessing 
information  within  this  structure; 

b.  matching  a  verbal  description  of  a  condition  with 
a  graphic  pattern  corresponding  to  that  condition, 
or  vice  versa,  deriving  a  verbal  description  of  a 
graphic  pattern  —  in  general,  creating  a  description 
or deriving a correspondence in one kind of language
from another language;

c. testing visually if an object fits within a given
space. 

The  study  of  primitive  operations  normally  involves 
laboratory  experimentation. 

Several studies here have contributed to the body
of knowledge regarding the process of design. Eastman
and Yessios have undertaken type two, or molecular
level, studies [3,4,5,18]. Ballay and Moran have
studied type three processes [1,13]. Studies undertaken
elsewhere have focused on molar studies. These
studies have allowed us to elaborate our understanding
of design and to determine the context for future
studies.


Rather than relate the results of particular studies,
I shall attempt to generalize from them. Of necessity,
these generalizations are interpretive, but they suggest
important criteria for the design of CAD systems. In
particular, no single sequential structure is likely to
be adequate for use by different designers in different
contexts. While a common set of operations may
eventually evolve, the unique information gained
from the application of each will lead to a different,
possibly unique sequence [4].

Also,  design  problems  are  usually  both  ill-struc¬ 
tured  and  ill-defined.  That  is,  they  are  not  easily 
characterized  within  any  one  representation  and  they 
initially  are  only  partially  defined.  The  designer  is 
responsible  for  both  structuring  the  problem  and 
completing  its  definition.  He  normally  does  so  today 
through  an  iterative  process  of  partial  definition  and 
resolution.  Solutions  are  used  to  prompt  his  experi¬ 
ence  for  the  purpose  of  elaborating  the  problem  defi¬ 
nition.  This  method  of  problem  solving  benefits  from 
displays of partial solutions in multiple representations [3].

The strategies used by designers correspond closely
to the general problem solving processes called heuristic
search. Generate-and-test, means-ends analysis,
and planning all can be observed in design protocols
collected at the molecular level; often they are intermixed.
These results suggest that a CAD system
should incorporate capabilities for tests, means-ends
tables, and the mapping capabilities needed for planning [3].

As recognized by others, intuitive design is hierarchical
and sequential; the subset of variables having
global effects is abstracted for early decisions, while
others of only local significance are generally resolved
later. Decision sequences are also influenced by the
external constraints upon variables posed by a particular
project. Because each design problem comes
with a unique set of constraints, different variables
are initially bounded in different problems. The sequence
of assignments to variables is partially determined
by this binding; those tightly bounded are
assigned early (before they become overconstrained).
Thus the intuitive sequence of decision-making used
by humans in each design problem may vary [3,13].


Many  design  problems  are  underconstrained  and 
have  no  precise  objective  function.  Without  greater 
information  regarding  goals,  a  great  range  of  solutions 
is  possible.  Moreover,  the  search  of  the  problem  space 
for  a  specific  solution  is  potentially  inefficient,  due  to 
the  lack  of  constraints  for  partitioning  the  domain 
down  to  manageable  size.  In  this  context,  designers 
often  add  constraints  to  simplify  their  own  problem 
solving.  These  constraints  reflect  subjective  concerns 
and  are  a  major  component  in  the  art  of  design. 
Traditionally,  the  adding  of  constraints  has  been  an 
important  prerogative  of  designers. 

The  mental  representations  of  form  and  the  opera¬ 
tions  on  them  used  by  individuals  correspond  closely 
to  their  perceptual  and  manipulative  experience. 
Sculptors  manipulate  forms  in  terms  of  the  carving 
operations  required  to  generate  them  from  a  simple 
block, draftsmen use projective geometry, and an art
historian is likely to use historical analogies. The
internal representations used by humans in design
thus  evolve  from  perceptual  and  tactile  experience 
[1]. These representational differences are an important
source of variation in human design.

System  Configurations  for  Computer-Aided  Design 

System design research here at CMU has followed
an evolution represented by a sequence of programs
for computer-aided design. To facilitate later reference
to them, I shall first give their names. The first
large effort completed here was Grason's GRAMPA,
implemented in 1970. This was followed by Eastman's
GSP [7,8] and Pfefferkorn's DPS [16]. In
1972, Yessios implemented FOSPLAN [20], then in
1973, SIPLAN [21]. Several small programs have also
been implemented during the same period. Below I
review each of these systems in terms of their representation
of space and treatment of constraints. Lastly,
I outline research in design languages.

A  basic  issue  to  be  resolved  in  the  design  of  any 
computer  system  is  the  organization  of  data  for  easy 
manipulation.  The  issue  has  broad  implications,  as 
different  characterizations  of  the  original  design  task 
lead  not  only  to  different  data  structures,  but  also  to 
nonisomorphic operators that may cause drastic differences
in problem solving difficulty. I am speaking,
of course, of the ubiquitous representation issue. A
variety  of  representations  of  the  physical  elements 
and  space  involved  in  design  have  been  developed  and 
explored  here  at  CMU.  A  common  property  of  all  of 
them  has  been  their  explicit  treatment  of  the  integer 
constraint  regarding  allocations  in  the  space-time  con¬ 
tinuum  —  any  point  in  space  may  be  occupied  by 
only  one  element  at  a  time. 


One of the earliest representations used was the
variable domain array [6,8]. See Figures 1a and 1b. It
is a two- or three-dimensional array, each variable
with non-zero subscripts representing a rectangular
domain. The dimensions of the domain were defined
in the zero vectors in each dimension; the X and Y
dimensions of $\alpha_{ij}$ were $\alpha_{i0}$ and $\alpha_{0j}$, respectively. The
values of the non-zero variables characterized the
state of each space, e.g., whether empty or filled and,
if filled, by what object. This representation, while
limited to rectangular domains, incorporates certain
features which seem highly desirable for CAD systems
and which have been incorporated into representations
developed later. Both filled and empty space are
characterized, allowing the easy locating of new objects
in non-overlapping arrangements. The value
stored to depict an occupied domain is also a pointer
that may be used to reference properties, spatial or
non-spatial, not defined in the array. The array also shows the
relation between domain locations and sizes; a complete
description of size allows derivation of location
through proper summations. The converse is not true;
in the general case, all locations do not allow derivation
of sizes. The mapping from sizes to locations is
from many variables to one. An early program using
the variable domain array was written by Moran in
LISP [14]. Later work has relied on ALGOL and
FORTRAN [8].
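
The relation between sizes and locations described above can be illustrated with a small sketch. This is not the original ALGOL, FORTRAN, or LISP code; it is a minimal Python illustration under assumed names (`widths`, `heights`, `cells`, `properties`, `locations`, `domain_rect`, `empty_domains`) and made-up values, for the two-dimensional case only.

```python
from itertools import accumulate

# Zero vectors: the size of each domain along X and Y (illustrative values).
widths = [3, 2, 4]   # plays the role of alpha_i0, the X dimension of column i
heights = [2, 5]     # plays the role of alpha_0j, the Y dimension of row j

# Non-zero entries: the state of each domain. None means empty; any other
# value acts as a pointer (here a dictionary key) to properties that are
# not stored in the array itself.
cells = [["desk", None],
         [None, None],
         ["shelf", "shelf"]]
properties = {"desk": {"material": "oak"}, "shelf": {"material": "steel"}}

def locations(sizes):
    """Derive the origin of every domain by summing the sizes before it."""
    return [0] + list(accumulate(sizes))[:-1]

def domain_rect(i, j):
    """Return (x, y, width, height) of domain (i, j)."""
    return (locations(widths)[i], locations(heights)[j], widths[i], heights[j])

def empty_domains():
    """List the indices of all empty domains, where new objects may go."""
    return [(i, j) for i, column in enumerate(cells)
            for j, value in enumerate(column) if value is None]
```

Here `locations(widths)` yields the origins `[0, 3, 5]`. Those origins alone do not reveal the size of the last domain, which is one way to see why locations do not in general allow the derivation of sizes.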


An  alternative  representation  was  developed  in 
John Grason's thesis research [12] and consisted of a
dual,  colored,  and  directed  graph.  See  Figure  1c. 
Instead of representing domains, each variable depicts
adjacencies  between  empty  or  filled  spaces.  The 
dashed  edges  depict  west-east  adjacencies,  while  the 
solid edges depict south-north adjacencies. The direction
of an edge, i.e., to or from a node, depicts orientation.
A node depicts a space. Locations are altered by
reconnecting  edges.  Overlaps  never  occur  as  long  as 
the  graph  remains  planar.  This  colored  and  directed 
graph  is  the  dual  of  a  graph  in  which  edges  depict 
walls  in  the  standard  manner.  The  coloring  and  direc¬ 
tions  impose  a  one-to-one  mapping  between  a  floor- 
plan  and  this  form  of  graph.  This  representation  has 
many  similarities  to  the  variable  domain  array.  Notice 
the  correspondence  between  edge  values  in  the  dual 
graph  and  the  zero  vector  values  in  the  variable  do¬ 
main  array.  Yet  the  dual  graph  introduces  many 
unique  efficiencies  not  available  in  the  array.  These 
will  be  described  more  fully  later.  Both  of  these 
representations  are  limited  to  rectangular  approxima¬ 
tions  of  more  complex  shapes.  Both  are  also  easily 
extended  to  three  dimensions. 

Figure 1. Spatial Representations: (a) Orthographic Drawing; (b) Variable Domain Array; (c) Dual Colored Graph


More  recently,  Charles  Pfefferkorn  developed  a 
general  two-dimensional  representation  in  DPS,  as 
part of his Ph.D. thesis [16]. It consisted of a set of
convex  domains,  each  described  in  terms  of  its  perim¬ 
eter  edges.  The  map  of  the  data  structure  used  for  a 
single  element  is  shown  in  Figure  2.  A  new  domain  is 
added  by  entering  its  edges  one  at  a  time  to  partition 
the  current  domains.  This  representation  may  spatial¬ 
ly  characterize  any  two-dimensional  shape  and  checks 
overlaps  by  restricting  the  partitioning  of  domains  to 
those  that  are  empty. 

Each  of  these  representations  has  associated  with 
it  facilities  for  describing  objects  and  spaces,  and 
operators  for  generating  arrangements.  While  each  is 
conceptually  quite  simple,  each  provides  quite  dis¬ 
tinct  capabilities  when  particular  types  of  problems 
are considered. In terms of a two-dimensional representation,
Pfefferkorn's is general and provides the
capabilities needed for CAD. Only integrating the
representation of objects in a way that complements
the treatment of constraints, as Grason's GRAMPA
has done, would be an improvement.

All problem formulations used to date have been
in terms of constraints, that is, tests which return a
Boolean predicate. These tests may reflect technological
requirements (maximum distance between a
memory box and CPU), public safety or building
code criteria (width of a stairway), or good design
practice (all offices should have windows). In the
most general case, these constraints are Boolean functions
of unlimited complexity and undefined internal
structure. Constraints regarding adjacency, access, distances,
sightlines, and orientation have been implemented
in this fashion.
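
As an illustration of constraints treated as opaque Boolean tests, the following is a minimal Python sketch. It is not taken from any of the CMU programs. The `Rect` type, the constraint constructors `max_distance` and `min_width`, and the numeric limits are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    """An axis-aligned rectangular element (hypothetical representation)."""
    x: float
    y: float
    w: float
    h: float

    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)

def max_distance(a, b, limit):
    """Build a test: the centers of elements a and b lie within `limit`."""
    def test(arrangement):
        (ax, ay), (bx, by) = arrangement[a].center(), arrangement[b].center()
        return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 <= limit
    return test

def min_width(name, width):
    """Build a test: the named element is at least `width` wide."""
    def test(arrangement):
        return arrangement[name].w >= width
    return test

def satisfied(arrangement, constraints):
    """An arrangement is acceptable only if every Boolean test passes."""
    return all(test(arrangement) for test in constraints)
```

Each constraint is simply a function from an arrangement to `True` or `False`, so the caller knows nothing about its internal structure. That opacity is what makes such tests general, and also what makes them hard to diagnose when they fail.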


Figure 2. A Map of a Single Object Description in Pfefferkorn's DPS [16]


It was quickly learned that at least two structurally
distinct kinds of constraints were involved in design
problems. Consider a constraint regarding adjacency.
Once two elements are adjacent they will
only become not adjacent if one of the adjacent
elements is moved relative to the location of the
other. Consider now a sightline constraint between
two elements. The relocation of any element may
alter the value of this constraint. We call the first type
local and the second global [9]. Local constraints are
much easier to deal with; they need be tested only
when an object which is an argument of the test is
altered. Corrective operations when a local constraint
fails are also much simpler to diagnose. It seems
possible in many cases to redefine global constraints
so that they become local, without loss of generality.
For example, once a sightline required between two
locations is satisfactory, the program may assign the
space required to be clear between them as (in a
sense) solid. No other objects can then be located
there and the test need not be repeated. Development
of a general set of local constraints is an important
objective in both computer-aided and automated
design.
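
The practical difference between the two kinds of constraint can be sketched as a bookkeeping rule for deciding which tests to repeat after an element is moved. The Python below is only an illustration. The dictionary layout, the `scope` field, and the functions `constraints_to_retest` and `localize` are hypothetical and do not describe the cited programs.

```python
def constraints_to_retest(constraints, moved):
    """Select the constraints whose value may change when `moved` is relocated.

    A local constraint is retested only if the moved element is one of its
    arguments; a global constraint must be retested after any move.
    """
    return [c["name"] for c in constraints
            if c["scope"] == "global" or moved in c["args"]]

def localize(constraint, reserved):
    """Turn a satisfied global constraint into a local one.

    The space that must stay clear is registered as a pseudo-object named
    `reserved`; since nothing else may occupy it, only moves of the original
    arguments or of the reserved space itself call for a retest.
    """
    return {**constraint, "scope": "local",
            "args": constraint["args"] | {reserved}}

constraints = [
    {"name": "adjacent(desk, window)", "scope": "local",
     "args": {"desk", "window"}},
    {"name": "sightline(door, desk)", "scope": "global",
     "args": {"door", "desk"}},
]
```

With these definitions, moving an unrelated element such as a cabinet triggers only the sightline test. After `localize` is applied to the sightline constraint, the same move triggers no tests at all.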

Each of these Boolean functions can be computationally
expensive. Moreover, it seems unlikely that
one can define a reasonably small set of tests that
would satisfactorily define different design problems,
even if they were all limited to a restricted domain.
An alternative approach was incorporated into
Grason's GRAMPA. The properties of the dual graph
have an interesting relation to a particular set of
spatial constraints. Specifically, there is a one-to-one
correspondence between many constraints and single
or small sets of variables within the dual graph representation.
Adjacency, in the general sense, is denoted
by the existence of an edge. The value of an edge
denotes the length of common border among adjacent
elements. Dimensions of a space are denoted
by the sum of edges of one direction and color
attached to a node. Orientation is denoted by color.
In this representation, a problem is defined as a
partially specified graph. A solution is a complete
planar graph satisfying constraints regarding the value
and ordering of edges to a node. This representation
reduces constraint testing to triviality, but with the
added cost of a more complicated evaluation of
planar feasibility.
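
To make the correspondence between constraints and graph variables concrete, here is a minimal Python sketch of a dual graph for a small floor plan. It is an assumption-laden illustration, not GRAMPA's data structure. The room names, the exterior nodes `W`, `E`, `S`, `N`, the edge list, and the helper functions are invented for the example, and no planarity check is attempted.

```python
# Floor plan assumed for the example: room A (4 wide, 6 tall) on the west,
# rooms B and C (each 3 wide, 3 tall) stacked on the east, B below C.
# Each edge: (from_space, to_space, color, length of the common border).
# "WE" edges run from the western space to the eastern one,
# "SN" edges from the southern space to the northern one.
EDGES = [
    ("W", "A", "WE", 6), ("A", "B", "WE", 3), ("A", "C", "WE", 3),
    ("B", "E", "WE", 3), ("C", "E", "WE", 3),
    ("S", "A", "SN", 4), ("S", "B", "SN", 3), ("B", "C", "SN", 3),
    ("A", "N", "SN", 4), ("C", "N", "SN", 3),
]

def adjacent(a, b):
    """Adjacency is simply the existence of an edge between two spaces."""
    return any({a, b} == {u, v} for u, v, _, _ in EDGES)

def border_length(a, b):
    """The value of an edge is the length of the common border."""
    return sum(n for u, v, _, n in EDGES if {a, b} == {u, v})

def extent(space, color):
    """Sum the incoming edges of one color at a node.

    Incoming WE edges measure the north-south extent of the space;
    incoming SN edges measure its west-east extent.
    """
    return sum(n for _, v, c, n in EDGES if v == space and c == color)

def orientation(a, b):
    """Return the color and direction of the edge joining a and b, if any."""
    for u, v, c, _ in EDGES:
        if (u, v) == (a, b):
            return (c, "forward")
        if (u, v) == (b, a):
            return (c, "reverse")
    return None
```

Every query above is a lookup or a short sum over edges, which is the sense in which constraint testing becomes trivial. The hard part, deciding whether a partially specified graph can be completed into a planar one, is left out of this sketch.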

A third approach for dealing with constraints is
called Constraint Projection. Instead of a Boolean
function for each constraint, the system incorporates
procedures for defining the spatial domains of feasible
locations and the corresponding range of feasible
orientations. Multiple constraints are treated by defining
the domain and orientation range for each
constraint, then applying the appropriate set function
to combine them into a final feasible domain. Thus, a
set of constraints can be reduced (without ever applying
them to the arrangement) to a single one. Reduction
is a very desirable capability for automated design
systems, as it is for other types of problem
solvers.
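
The reduction step can be sketched in a few lines. The Python below assumes, purely for illustration, that each constraint projects to one axis-aligned rectangle of feasible locations and a finite set of feasible orientation angles, and that the constraints are combined conjunctively by intersection. The names `intersect` and `reduce_constraints` and the sample values are hypothetical.

```python
def intersect(a, b):
    """Intersect two rectangles given as (xmin, ymin, xmax, ymax)."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    return (x0, y0, x1, y1) if x0 <= x1 and y0 <= y1 else None

def reduce_constraints(projections):
    """Combine (domain, orientations) pairs into a single feasible pair.

    Returns None as soon as either the domain or the orientation set
    becomes empty, i.e. when the constraints cannot all hold at once.
    """
    domain, angles = projections[0]
    for next_domain, next_angles in projections[1:]:
        domain = intersect(domain, next_domain)
        angles = angles & next_angles
        if domain is None or not angles:
            return None
    return domain, angles

# Three projected constraints on one element (illustrative values only).
near_window = ((0, 0, 10, 3), {0, 90, 180, 270})
away_from_door = ((4, 0, 10, 8), {0, 180})
within_reach = ((2, 1, 6, 5), {0, 90})
```

Calling `reduce_constraints([near_window, away_from_door, within_reach])` returns the single pair `((4, 1, 6, 3), {0})` without ever placing the element. General shapes and disjunctive constraints would need richer set operations than the rectangle intersection assumed here.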

Each of the above methods of treating constraints
imposes strict restrictions on the design representation.
Generality of the shapes characterized by a
representation is only the first criterion in the development
of data structures for CAD. Another issue is
the base language for its implementation. Yessios has
explored a range of data structures for computer-aided
design, their specifications regarding shape and
arrangement, and various grammars for combining
them [20,21]. His work can be considered in two
different but equally valuable perspectives. One is
that these languages will provide the primitives for
higher level CAD systems; this is a traditional perspective.
The second view is that a major task in design is
translation. A design problem first is an existence
question regarding the mapping of a set of statements
in one representation into a spatial one. If a mapping
exists, i.e., the design is feasible, then the iterative
step is one of (a) examining the spatial realization of
the first design problem and redefining the original
problem statement based on this new information, or
(b) applying more complex analyses to the statement
and using the results to generate a new problem
statement. This second view marks an advance in the
conception of man-machine organization, for it partitions
design tasks according to the formal definition
of their complicatedness, e.g., those problems that are
syntactically resolvable within restricted grammars
and all others.


Expansion  of  the  Contributions  of  the  Computer 
to  Design 

The  description  by  Coons  at  the  beginning  of  this 
paper implicitly partitions the tasks between man and
machine. The machine does analysis and numerical
studies; the man uses his "creativity" to solve design
problems. This a priori conception of the two partners'
contribution is too limited. We believe that a
computer  has  at  least  the  potential  for  providing  the 
same  skills  as  a  "dumb"  draftsman  and  that  some 
analyses  will  forever  remain  the  province  of  visual 
examination.  In  the  former  case,  the  computer  should 
be  able  to  respond  to  the  description  of  simple,  well 
structured  design  problems  and  generate  solutions  for 
them.  It  should  be  able  to  modify  its  solutions  as  new 
information  is  received  from  the  designer  "looking 
over  its  shoulder". 

A good portion of our research has focused on the
automatic generation of the spatial arrangement of
physical elements. The general formulation is, given:

$s$ ::= a space, bounded or unbounded;

$b_1, b_2, \ldots, b_m$ ::= a set of elements of fixed or variable shapes;

$c_1, c_2, \ldots, c_n$ ::= a set of constraints defining required relations between two or more elements and the shape of single elements;

$d_1, d_2, \ldots, d_p$ ::= a set of operators for mapping elements into the space in different ways and possibly for altering their shape;

$e^0$ ::= an initial arrangement, which may simply be $s$;

find: $(e^i, b^{i+1}, d^{i+1}) \rightarrow \{ e^{i+1} \mid e^{i+1} \Leftrightarrow (c_1, c_2, \ldots, c_n) \}$

where $\Leftrightarrow$ is a matching operation. This formulation
presents space planning as a state space problem involving
a search through the ubiquitous OR tree. The
task is the efficient search of this tree. In contrast
with other heuristic search tasks, at least four unique
issues are involved in the above formulation:


a.  Location  operators  —  if  there  is  more  than  one, 
there  are  a  countably  infinite  number  of  locations 
for  any  element  within  a  space.  Which  subset  of 
locations  is  worth  considering  at  any  state  of  a 
design  problem,  that  is,  how  should  the  location 
operators be specified? Manual design gives no
direct answer to this problem.

b.  Similarly,  there  may  be  countably  infinite  shapes 
satisfying  the  shape  constraints  of  an  element,  but 
far fewer when all are considered in a single arrangement.
What shape operations are effective in
finding  this  subset,  and  how  should  they  be  com¬ 
bined  with  the  location  operations? 

c.  Given  effective  operators,  what  search  strategy  is 
most  likely  to  lead  to  a  solution  quickly,  with 
minimal  states  being  generated? 

d.  Given  the  large  number  of  variables  required  to 
describe  a  state  (six  for  location  of  each  object  in 
3-space  plus  an  undefined  number  for  its  shape), 
what  bookkeeping  procedures  are  most  effective  in 
guaranteeing  that  search  will  proceed  without 
looping? 

Each  of  these  issues  has  received  attention  in  CAD 
research  here  at  CMU. 
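
The state-space formulation can be made concrete with a small depth-first search. The sketch below is a deliberately simplified illustration in Python, not any of the CMU programs. It assumes a bounded integer grid as the space, fixed rectangular shapes, a single location operator that enumerates grid positions in a fixed order, and Boolean constraints that return `True` while their arguments are not yet placed. All function names and values are hypothetical.

```python
from itertools import product

def overlaps(a, b):
    """True if rectangles (x, y, w, h) share interior area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def candidate_locations(space, size):
    """Location operator: every grid position where the element fits."""
    (width, height), (w, h) = space, size
    return list(product(range(width - w + 1), range(height - h + 1)))

def adjacent(a, b):
    """Constraint: elements a and b share a stretch of border."""
    def test(arrangement):
        if a not in arrangement or b not in arrangement:
            return True  # undecidable on a partial arrangement
        ax, ay, aw, ah = arrangement[a]
        bx, by, bw, bh = arrangement[b]
        share_x = ax < bx + bw and bx < ax + aw
        share_y = ay < by + bh and by < ay + ah
        touch_x = ax + aw == bx or bx + bw == ax
        touch_y = ay + ah == by or by + bh == ay
        return (touch_x and share_y) or (touch_y and share_x)
    return test

def search(space, elements, constraints, arrangement=None):
    """Depth-first search of the OR tree; one tree level per element."""
    arrangement = arrangement or {}
    if len(arrangement) == len(elements):
        return arrangement
    name, size = elements[len(arrangement)]
    for x, y in candidate_locations(space, size):
        rect = (x, y, *size)
        if any(overlaps(rect, other) for other in arrangement.values()):
            continue  # one element per point in space
        trial = {**arrangement, name: rect}
        if all(test(trial) for test in constraints):
            result = search(space, elements, constraints, trial)
            if result is not None:
                return result
    return None  # exhausted this subtree; the caller backtracks
```

For example, `search((4, 3), [("desk", (2, 1)), ("chair", (1, 1)), ("shelf", (1, 3))], [adjacent("desk", "chair")])` places the three elements without overlap. The grid sidesteps the question of which of the infinitely many locations deserve consideration, so the sketch shows only the search skeleton.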

The location problem has been treated by a variety
of heuristic methods and one exact one. Pfefferkorn's
DPS, for instance, identifies each convex corner of
the empty space as a possible location and places its
reference on a list to try [16]. Grason's program tries
adjacencies (of rectangular objects) with corners
aligning [12]. Both of these are heuristic. Constraint
Projection provides an exact method for dealing with
the location problem. It derives a reduced domain
from the set of domains characterizing the constraints
of an element. The reduced domain depicts a
homogeneous region within which any location satisfying
the orientation requirement is equally acceptable.

The shape definition problem is that of finding a feasible
solution to two sets of constraints, one set defining the
"internal" and constant requirements delimiting acceptable
shapes, and another imposed exogenously on
this set by context, delimiting the locations the shape
may occupy. Two types of shape generation operations
have been tried. The first was initially implemented
by Sutherland in SKETCHPAD and consisted
of a set of (possibly non-linear) equations specifying
properties of the set of points used to define the
perimeter of an object. Whenever a new context delimits
the location of one or more points, the shape is
redefined using a least-squares, iterative convergence
method. We at CMU have explored an alternative
method of variable shape definition based on
generative assumptions. Using a primitive which enlarges
a portion of an object so that the resulting
form satisfies a group of (variable) tests within the
primitive, we have been able to develop sequences of
calls to this expansion operator so that one, or a
whole set, of variable shaped elements are formed
that satisfy both internal and relational criteria. The
sequence of expansion is again a tree (AND-OR) and
the objective is to search it with minimal backtracking.
Our efforts have been directed toward pipe and
duct layout, circulation, and room arrangement [10].

Given  the  large  set  of  variables  which  describe  a 
design  and  the  complex  relations  among  some  of 
them,  an  important  question  arises  concerning  the 
general  method  for  bounding  them  that  will  fulfill  a 
set  of  constraints  imposed  by  a  user.  As  described 
earlier,  this  question  has  been  formulated  within  a 
state-space  heuristic  search  representation. 

Any  OR-tree  is  easily  considered  as  a  Boolean 
function.  Using  minimal  assumptions  regarding  the 
final  distribution  of  elements,  we  have  developed 
search  decision  rules  which  minimize  the  cost  of 
evaluating  this  form  of  Boolean  function.  The  search 
is  efficient  in  finding  a  solution  if  one  exists;  this  is 
the  criterion  driving  the  search  process.  But  if  no 
solution  exists,  our  procedures  resort  to  implicit 
enumeration  and  may  waste  much  time  fruitlessly 
[9]. Currently, we are trying to develop a practical
failure  criterion. 
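
The decision rules referred to above are not spelled out in this review. As a hedged illustration only, the sketch below uses one classical rule for evaluating an OR of independent tests: try the branches in decreasing order of (estimated probability of success) / (cost of testing). The independence assumption, the function names, and the numbers are all introduced here and are not taken from the CMU programs.

```python
from itertools import permutations

def evaluation_order(tests):
    """Order OR-branches by success probability per unit cost (an assumed rule).

    tests: list of (name, probability_of_success, cost) tuples.
    """
    return sorted(tests, key=lambda t: t[1] / t[2], reverse=True)

def expected_cost(ordered):
    """Expected cost of evaluating the branches left to right,
    stopping at the first branch that succeeds."""
    total, p_reach = 0.0, 1.0
    for _, p, c in ordered:
        total += p_reach * c      # we pay for this test only if we reach it
        p_reach *= (1.0 - p)      # we continue only if it failed
    return total

# Hypothetical branches of one OR-node.
tests = [("a", 0.2, 1.0), ("b", 0.9, 2.0), ("c", 0.5, 4.0)]
best = evaluation_order(tests)
cheapest = min(expected_cost(list(p)) for p in permutations(tests))
```

Note that the expected cost is minimized only when a solution may exist; if every branch fails, every ordering costs the same, which is the situation the failure criterion mentioned above is meant to address.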

A  very  large  number  of  variables  is  required  to 
describe  any  state  in  CAD.  In  order  to  guarantee  that 
a  program  does  not  generate  equivalent  states  and 
therefore  loop,  some  trace  of  past  states  is  required. 
A  single  general  approach  has  been  used  in  the  pro¬ 
grams  developed  at  CMU,  with  different  variations. 
All  have  only  considered  arrangement  variables  with 
no  shape  variation  and  are  based  on  an  assumption  of 
a depth-first search. Given a lexicographic ordering of
locations for each element and a fixed sequence for
manipulating each element (corresponding to a level
in the tree), a pointer to the current location of each
element defines both the current state and all others
that have been considered.
Pfefferkorn relied precisely on this technique in
his program. When initially considering each element, a
TRY-list was generated and ordered heuristically.
Backtracking requires only a pointer to the current
location of each element and an ordering of the
elements. Different orientations of an element were
tried at each location. Eastman's program relied on
location operators which automatically generate a
single next alternative in a lexicographic order.
Bookkeeping requires that each operator internally identify
whether or not it is able to define a location, and
that the program keep track of the first location
generated when the process is moving down the
search tree.
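
The pointer scheme can be sketched as follows. This is a minimal illustration under simplifying assumptions, not a reconstruction of either program: elements are placed on a one-dimensional strip instead of a floor plan, candidate locations are the integers tried in increasing order, and the element order is fixed. The names `arrange`, `lengths`, and `strip` are introduced here.

```python
def arrange(lengths, strip):
    """Depth-first search for non-overlapping placements on a strip.

    ptr[i] is the current location of element i (-1 = none tried yet).
    The vector of pointers is the whole search state: every location
    before ptr[i] has already been considered at level i.
    """
    n = len(lengths)
    ptr = [-1] * n

    def clashes(i, j):
        # Two intervals overlap if each starts before the other ends.
        return ptr[i] < ptr[j] + lengths[j] and ptr[j] < ptr[i] + lengths[i]

    level = 0
    while 0 <= level < n:
        ptr[level] += 1                        # next location in order
        if ptr[level] + lengths[level] > strip:
            ptr[level] = -1                    # locations exhausted: reset
            level -= 1                         # and backtrack one level
        elif not any(clashes(level, j) for j in range(level)):
            level += 1                         # feasible: move down the tree
    return list(ptr) if level == n else None

solution = arrange([2, 3, 1], 6)
no_solution = arrange([3, 3, 1], 6)
```

No list of visited states is kept; the second call fails only after the pointers have swept every combination, which is the implicit enumeration cost discussed earlier.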

These  types  of  approaches  greatly  simplify  the 
state description but lead to other complications. In
particular, the location operators we have used generate
different locations for each arrangement of elements.
This means the TRY-list must be regenerated
each time an element higher in the search tree is
relocated. This is done in both of the above programs.
But it also means that different orders of objects
generate different locations and thus result in different
search trees. Eastman's GSP does not allow element
reordering and is limited to searching arrangements
resulting from the program's estimation of the
most efficient ordering. Pfefferkorn's allows limited
reordering.


The Direction of Future Research
in  Computer-Aided  Design 

Few  of  the  problems  reviewed  above  have  been 
completely resolved. We have only begun to consider
the requirements for CAD systems implied by analyses
of the tasks of design.

Current  research  is  proceeding  in  a  variety  of  areas 
described above, including data structures for three-dimensional
objects, the development of a constraint
language for describing any kind of spatial relationship
between elements, and problem decomposition.
In addition, we are exploring alternative methods for
bounding the search process in large arrangement
problems. That is, when should a program "give up"
looking for (a) feasible arrangement(s)? Two approaches
to the bounding problem show merit. The
first is to use information found in a partial enumeration
of the tree to generate a proof that a solution
cannot exist in other parts. This requires that axioms
be induced from a set of failed search states. For
example, in Figure 3 it is intuitively easy to see that if
the sum of the areas of A, B, C, D is smaller than X +
Y but greater than X, then one or more of the objects
must fit in space Y for an arrangement to be feasible.
We  are  exploring  how  arithmetic  analysis  over  various 
partitions  of  the  problem  space  may  be  used  to  guide 
and  bound  search  in  CAD. 

Figure 3
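
The area argument in the text can be written as a small arithmetic test. The sketch below is illustrative only: the figure itself is not reproduced here, X and Y are treated simply as the areas of two spaces, and the function name and return strings are introduced for this example. Area is a necessary condition, not a sufficient one, so the test can rule arrangements out but cannot prove that one exists.

```python
def area_bound(object_areas, x_area, y_area):
    """Classify an arrangement problem using total area alone."""
    total = sum(object_areas)
    if total > x_area + y_area:
        return "infeasible"        # the objects cannot all fit anywhere
    if total > x_area:
        return "Y required"        # at least one object must go in space Y
    return "no conclusion"         # area alone does not constrain the search

# Hypothetical areas for objects A, B, C, D and spaces X, Y.
result = area_bound([4, 3, 2, 2], x_area=10, y_area=5)
```

A search program could apply such a test to each partition of the remaining space before descending further, pruning any subtree classified as infeasible.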


A  second  approach  to  bounding  search  is  in  terms 
of  cost  effectiveness.  Can  the  probability  of  finding  a 
solution  be  dynamically  estimated  as  search  proceeds 
to  allow  derivation  of  an  expected  cost  of  search?  If 
so,  this  also  would  be  an  effective  criterion  for  stop¬ 
ping  search  after  partially  enumerating  the  tree  of 
possibilities. 

Computer-aided design, particularly when it includes
synthesis capabilities and spatial considerations,
has a richness of issues possibly unparalleled
among the problems now being investigated by the AI
community. Moreover, results have many applications,
including the direct ones for CAD, but also for
robotology (representations of the physical environment,
the planning of manipulation tasks) in both
space and industrial applications. The design implications
range from architecture, to computer design, to
regional land use planning, to controlling pollution
effects. We at CMU expect to continue our program
of research in augmentation of the design process.


On the Scheduling Aspects of
Timing  Concurrent  Processes 

A.  N.  Habermann 

Introduction 

The  design  and  development  of  operating  systems 
has  enriched  computer  science  with  interesting 
studies  on  control  and  data  structures.  Two  such 
contributions  are  the  studies  of  phenomena  associ¬ 
ated  with  concurrency  and  of  the  design  and  imple¬ 
mentation  of  scheduling  strategies. 

The  purpose  of  this  paper  is  to  examine  briefly  the 
impact  of  scheduling  on  programming  timing  con¬ 
straints  in  concurrent  processes.  A  variety  of  aspects 
related  to  this  topic  have  been  discussed  in  a  series  of 
papers and reports in recent years; this paper reviews
and  summarizes  the  overall  result  of  this  work. 

The  first  section  shows  what  sort  of  flexibility  is 
desirable  in  programming  timing  structures  on  behalf 
of  process  scheduling.  In  the  next  section  two  general 
timing  structures  are  discussed  that  allow  imple¬ 
mentation  of  arbitrary  scheduling  rules.  Subsequently 
some  of  the  verification  methods  are  reviewed  that 
are  based  on  using  the  properties  of  timing  rules  and 
the  structure  of  control  programs.  Finally,  the  class 
of  problems  is  considered  that  asks  for  an  imple¬ 
mentation  of  priority  rules  by  means  of  timing  struc¬ 
tures.  A  recent  study  showed  that  these  problems  can 
be  solved  by  means  of  one  unifying  principle  of 
representation. 

Timing  Concurrent  Processes 

The  fact  that  concurrent  processes  share  resources 
in  the  form  of  devices,  programs,  and  data  gives  rise 
to  possible  conflicts  of  interest.  Dijkstra  has  shown 
how such conflicts can be resolved using critical sections
[5]. It was shown in a series of papers that
these can be implemented using the "read/write
cycle" of a machine as the most elementary critical
section. Critical sections of arbitrary length are often
programmed  by  means  of  two  simpler  critical  sec¬ 
tions:  one  at  the  beginning  guarding  the  entrance  and 
one  at  the  end  controlling  the  exit.  The  function  of 
the  simple  one  at  the  entrance  should  be  to  grant  or 
deny  its  caller  permission  to  enter.  The  function  of 
the  one  at  the  exit  should  be  to  record  that  a  process 
is leaving, and to grant entrance permission, if possible,
to one or more of the processes which were
denied earlier.

Because of this general structure it seems appropriate
to devise two standard critical sections, one for
entrance and one for exit, and to use these for programming
critical sections of arbitrary length. Various
proposals to this effect have been considered and a
variety of such "primitive critical sections" have been
implemented. There are even proposals to base the
whole timing issue on such primitives [1]. Representatives
of two major categories are the operations
LOCK and UNLOCK and P,V operations [6].

An  advantage  of  primitives  is  that  a  waiting  process 
does not waste any time of a processor that it possibly
shares with the very process that will wake it up.
Another  advantage  of  P,V  operations  is  that  these  can 
easily  be  extended  to  handle  critical  sections  of  which 
several  may  be  executed  simultaneously  (Dijkstra's 
counting  semaphores),  whereas  LOCK  and  UNLOCK 
do  not  allow  such  an  extension.  Finally,  a  difference 
between  the  two  (which  might  not  be  seen  as  an 
advantage)  is  that  the  order  in  which  processes  pass  a 
P  operation  is  fixed  by  the  chosen  implementation,  so 
programs  could  rely  upon  that  order,  whereas  it  is 
hard  to  predict  which  process  will  pass  a  LOCK 
operation  when  several  are  trying  to  do  so. 
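
The difference in extensibility can be seen in a short sketch. It is an illustration only: Python's `threading.Lock` stands in for LOCK and UNLOCK and `threading.Semaphore` for P and V, which is an assumed correspondence and not the implementation discussed in the cited work. Non-blocking attempts are used so that the outcome of each attempt is visible.

```python
import threading

lock = threading.Lock()             # LOCK / UNLOCK: at most one holder
drives = threading.Semaphore(3)     # P / V with initial value 3

def try_p(sem):
    """Attempt a P operation without waiting; True if permission is granted."""
    return sem.acquire(blocking=False)

# Three processes may be inside the section at once; the fourth is refused.
granted = [try_p(drives) for _ in range(4)]

drives.release()                    # V: one process leaves
after_v = try_p(drives)             # so one more may now enter

# A lock has no counterpart to the initial value: the second attempt fails.
lock_first = lock.acquire(blocking=False)
lock_second = lock.acquire(blocking=False)
```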

The  use  of  standard  primitives,  however,  is  ab¬ 
solutely  inadequate  for  large  critical  sections  such  as 
those  needed  for  allocation  and  use  of  resources. 

Using  the  primitives  is  inappropriate,  generally  speak¬ 
ing,  if  the  situation  has  one  of  the  following  three 
characteristics:

1.  it matters which process is selected when one of
several is considered for entrance permission;

2.  it  may  not  be  wise  to  grant  permission  because  of 
a  possible  deadlock; 

3.  the  decision  to  grant  permission  may  be  regretted 
if  later  permission  must  be  withheld  from  a 
process  for  which  entering  is  more  urgent. 


An  example  of  resource  management  illustrates 
such  characteristics.  Suppose  ten  identical  magnetic 
tape  drives  are  pooled  among  three  types  of  proc¬ 
esses: 

P-type processes need one tape unit at a time;

Q-type processes need two units during some
period of time (e.g., for copying);

R-type  processes  need  three  units  during  some 
period  of  time  (e.g.,  for  updating  or  tape  cor¬ 
rection). 

We spot easily the deadlock which will occur if, for
instance, five R-type processes should succeed in
seizing two drives each [10]. Also, it may be wise to
select a process from the processes waiting for tape
drives based on an external priority and the number
of drives it already has in use. But when such a
selection strategy is implemented, another problem
may arise, namely, that of permanent blocking [18];
for example, an R-type process may never be selected
because a P-type or Q-type process happens to be
waiting at all times. Finally, the decision to grant the
last free drive to a newly arrived R-type process may
be regretted if, shortly afterwards, another R-type
process requests its third drive. It is not surprising
that the impact of scheduling on timing structures is
not always correctly appreciated [17].
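
The tape drive deadlock can be checked mechanically. The sketch below is a simplified safety test for a single resource type, written in the style of the well-known banker's algorithm; it is offered as an illustration under the assumption that each process declares its maximum claim in advance, and the function and parameter names are introduced here.

```python
def is_safe(total, holding, claim):
    """True if some completion order lets every process obtain its full
    claim, finish, and return its drives.

    total:   number of identical drives in the pool
    holding: drives currently held by each process
    claim:   maximum number of drives each process will ever need
    """
    free = total - sum(holding)
    finished = [False] * len(holding)
    progress = True
    while progress:
        progress = False
        for i, (held, maximum) in enumerate(zip(holding, claim)):
            if not finished[i] and maximum - held <= free:
                free += held            # process i can finish and release
                finished[i] = True
                progress = True
    return all(finished)

# Five R-type processes (claim 3) holding two drives each: no drive is free
# and every process needs one more, so nobody can ever finish.
deadlocked = not is_safe(10, [2, 2, 2, 2, 2], [3, 3, 3, 3, 3])

# With one drive still free, the processes can finish one after another.
still_safe = is_safe(10, [2, 2, 2, 2, 1], [3, 3, 3, 3, 3])
```

A scheduler would run such a test before granting a request and withhold permission when the resulting state is not safe, which is exactly the kind of decision that cannot be built into a standard primitive.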

LOCK and UNLOCK would not provide any
means of dealing with these issues. Programming the
use of drives within a critical section P(drives) -
V(drives), where "drives" has the initial value 10,
would mean that special algorithms for dealing with
the particular circumstances would have to be programmed
as part of the P,V operations themselves.
Thus, extrapolating to other situations, there would
be  a  need  for  as  many  versions  of  P,V  operations  as 
there  are  different  circumstances,  but  this  is  in  con¬ 
flict  with  the  idea  of  standard  primitives. 

On  the  other  hand  there  is  an  important  type  of 
critical  section  for  which  a  standard  implementation 
with  primitives  is  perfectly  adequate.  This  is  the  type 
for  which  the  probability  that  more  than  one  process 
will  be  waiting  when  an  earlier  one  reaches  the  exit  is 
close  to  zero.  We  will  use  in  this  paper  the  attribute 
"small"  for  a  critical  section  that  has  this  property. 
Small  critical  sections  are  usually  short  pieces  of  code 
without  potential  delay  or  repetition  with  a  large  or 
unknown  bound.  Any  of  the  primitives  is  suitable  for 
programming small critical sections, but P,V operations
are still preferable when a CPU is multiplexed
[6].


Cooperating  Processes 

The  term  "cooperating"  describes  the  situation 
that  a  process  P  may  have  to  wait  until  another 
process signals the occurrence of an event E. This is a
fairly common relation between processes, e.g., when
processes communicate or when one process
controls another. Processes related through common
critical  sections  can  even  be  viewed  that  way,  because 
once  a  process  has  entered  a  critical  section,  it  must 
cause  the  event  of  leaving  it  before  another  process 
can  get  permission  to  enter. 

Cooperation  of  processes  can  be  described  in  terms 
of  operations  "wait"  and  "signal"  that  operate  on 
eventnames.  It  is  possible  to  implement  these  opera¬ 
tions  as  P,V  operations,  but  we  must  realize  that  the 
delay  in  such  a  P-operation  is  even  less  predictable 
than  when  used  to  enter  a  critical  section.  We  must 
assume  that  the  process  that  waits  on  an  event  cannot 
find  out  when  another  process  will  signal  the  occur¬ 
rence  of  that  event;  it  may  not  even  know  from 
which  process  a  signal  can  be  expected.  This  means 
that  standard  primitives  are  also  inadequate  for  imple¬ 
menting  wait  and  signal  operations  because  of  prob¬ 
lems  with  selection,  deadlocks  and  regrettable  deci¬ 
sions. 

It has been shown that these problems can generally
be solved by applying "private eventnames" [12].
The attribute "private" means that an eventname
E[i] associated with a process P[i] will exclusively be
used by other processes to signal P[i], whereas P[i] is
the only process that will ever wait on the occurrence
of E[i]. The point about private eventnames is that
the selection problem is entirely separated from the
implementation of the wait operation, because, no
matter how long a delay is caused by wait(E[i]),
there is only one process that ever will wait on the
occurrence of E[i] and thus, this is the only one that
could be selected!

Greater  flexibility  for  implementing  the  necessary 
scheduling  is  now  achieved  by  either  one  of  the 
following methods:

1.  implement  wait  and  signal  not  as  P,V  operations, 
but  as  critical  sections  in  which  scheduling  can  be 
programmed  as  needed; 

2.  have  the  processes  involved  send  requests  and 
completion  notices  to  a  controlling  agent  that  acts 
as  a  policeman  regulating  the  traffic  according  to 
well  established  rules. 


Note  that  the  second  method  does  not  eliminate  our 
task  of  programming  cooperation  among  processes;  it 
only  moves  the  interaction  from  pairs  of  processes 
having  equal  rights  to  individual  members  of  the 
community  that  must  cooperate  with  a  central 
agency.  (It  is  the  difference  between  placing  STOP 
signs  or  traffic  lights  at  a  street  intersection.) 

When the first method is applied, entrance is programmed
as a small critical section followed by a wait
operation  on  the  private  eventname  of  its  caller.  The 
program  within  this  critical  section  is  small  enough  to 
allow  the  use  of  P,V  operations  for  delimiting  it.  Its 
function  is  to  investigate  whether  the  process  can 
enter;  if  not,  the  process  will  be  delayed  in  the 
subsequent  wait  statement.  Exit  is  also  programmed 
as  a  small  critical  section  for  which  P,V  operations  are 
adequate  open-  and  close-brackets.  Its  function  is  to 
see whether the change of state, caused by leaving,
would  allow  one  of  the  processes  that  is  waiting  on  its 
private  eventname  (if  any)  to  continue. 

It  has  been  shown  that  such  constructs  as  entrance 
and  exit  behave  as  P,V  operations  and  so  these  are 
certainly  sufficient  to  implement  arbitrary  critical 
sections [12]. But the great advantage gained over
plain P,V operations is that we have not committed
ourselves to the programs for permission or selection
and,  thus,  we  have  not  made  decisions  which  are 
unnecessary  for  implementing  large  critical  sections. 
The  only  restriction  on  programming  permission  and 
selection  is  that  the  critical  sections  for  entrance  and 
exit  should  be  small  (in  the  technical  sense  of  this 
paper). 
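
The first method can be sketched as follows. This is a minimal illustration, assuming Python threads as processes, a mutex for the two small critical sections, and one semaphore per process as its private eventname; the class name `Gate`, the priority rule, and the bookkeeping are introduced here and are not taken from the cited papers.

```python
import threading

class Gate:
    """Entrance/exit pair guarding a section with `capacity` simultaneous users."""

    def __init__(self, capacity):
        self.mutex = threading.Lock()   # delimits the two small critical sections
        self.free = capacity
        self.private = {}               # pid -> private eventname (semaphore, value 0)
        self.waiting = []               # (priority, pid) of delayed processes

    def enter(self, pid, priority=0):
        event = threading.Semaphore(0)
        with self.mutex:                # small critical section: may pid enter?
            self.private[pid] = event
            if self.free > 0:
                self.free -= 1
                event.release()         # signal own event: the wait below passes at once
            else:
                self.waiting.append((priority, pid))
        event.acquire()                 # wait(E[pid]), outside the critical section

    def leave(self):
        with self.mutex:                # small critical section: who may continue?
            self.free += 1
            if self.waiting:
                chosen = max(self.waiting)         # scheduling rule: highest priority
                self.waiting.remove(chosen)
                self.free -= 1
                self.private[chosen[1]].release()  # signal(E[chosen])
```

The selection rule is a single line inside `leave`; replacing `max` by a deadlock test or an aging rule changes the scheduling without touching the wait and signal operations, which is the separation the text describes.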

The  second  method  of  dealing  with  selection, 
deadlocks,  and  priority  rules  by  means  of  a  central 
agent  is  appealing  because  it  seems  to  separate  those 
issues  nicely  from  the  structure  of  the  programs  to 
which  they  apply.  It  is  certainly  true  that  those 
programs  will  have  a  simpler  structure,  but  overhead 
is  likely  to  increase  due  to  the  additional  calls  on  the 
agent,  and  the  possible  need  to  reconstruct  lost  infor¬ 
mation.  A  process  may  have  to  call  the  agent  for 
various  reasons,  e.g.,  when  requesting  a  resource  and 
when  releasing  one.  At  the  place  where  the  agent  is 
called, the reason for calling is perfectly well known.
However, this information must be transmitted explicitly
with the call, and the agent must find out for
what reason its services are required. Thus, information
that is present is lost through a uniform call on
the agent and must be reconstructed when the agent
is  activated.  In  order  to  preserve  the  idea  of  allowing 
the  environment  of  the  processes  to  handle  selection 
and  other  issues,  one  could  split  the  agency  into 
individual  agents  each  to  be  called  for  a  particular 
task.  This  indeed  seems  an  acceptable  solution  under 
some circumstances [7], but in other cases such a
solution is not feasible, as for instance in case of
peripheral device control. The hardware makes it necessary
that only one agent controls a peripheral device
and it must regulate all requests for device operations.

Working  with  an  agent,  however,  still  does  not 
remove  the  task  of  programming  timing  structure  for 
processes  of  unequal  rank,  because  cooperation  must 
then  be  programmed  between  the  processes  and  the 
agents.  It  seems  that  the  use  of  an  agent  is  to  be 
recommended  for  complicated  hierarchies  or  a  great 
variety  of  ranks  or  complicated  priority  rules.  But  it 
also  seems  worthwhile  to  pay  special  attention  to  the 
effect  of  timing  rules  on  the  cooperation  of  processes 
partitioned  into  a  small  number  of  fixed  ranks,  as  is 
the  case  with  an  agent  and  its  callers. 

Verification  of  Timing  Rules 

The  additional  complexity  caused  by  concurrency 
prohibits a straightforward extension of Floyd's
method of inductive assertions to concurrent
processes [19]. The number of states to be considered
explodes even for trivial systems. Not only is
there the problem of finding the right assertion, but it
could easily be the case that the correctness proof
itself is much longer and an order of magnitude more
complicated than the programs involved [20].

A  more  promising  method  was  found  in  the  same 
spirit  as  the  axiomatic  approach  for  proving  program 
correctness [16] and proving the correctness of APL
programs [9]. The approach that these methods have
in common is to exploit the structure of the given
programs in the correctness proof. In Hoare's system,
properties of control structures are expressed in the
form of axioms and can be used in that form in a
correctness proof. In Susan Gerhart's thesis, properties
of APL operators are formulated precisely and
thus lead to a more concise and tractable correctness
proof [9].


Since "the state" of a system of concurrent processes
is a rather vague notion in any case, it makes
more sense to show that there is an abstract representation
which has certain desired properties that
will not get lost when going to more detailed versions.
Some success was scored in this way with respect to
properties that can be derived from timing structures
in concurrent processes [11, 12]. It was found that
the working of P(E) and V(E), or wait(E) and
signal(E), can be characterized by the fact that a certain
relation remains invariant under these operations. The
relation says that the number of times permission was
granted to continue after a wait(E) equals the minimum
of the number of attempted wait(E)'s and the
number of executed signal(E)'s incremented by an
initial constant.

The verification method using the invariant relation
was applied to a useful communication system.
Not only could it be proved that the communication
was deadlock free, but other interesting properties
also emerged from the analysis. For example, it was
shown that senders and receivers could access the
bounded communication buffer at the same time
without getting into a conflict when the buffer was
empty or when a first message was placed. Moreover,
a natural simplification of the control programs was
found for the cases that either the group of senders,
or the group of receivers, or both, were reduced to
one process. It was later shown that the invariant
could also be used to verify the correctness of a more
complicated communication system in which senders,
or receivers, or both, may get ahead of one another
[24].

The property verification method was modestly
successful when applied to the Cigarette Smokers
Problem [22]. A solution of this problem was presented
in the form of a Petri net and it was shown
that this problem could not be solved with a restricted
form of Dijkstra's semaphores without the
use of some form of conditional statement. But a
solution using semaphore arrays was soon found [21]
and the property verification method was applied to
that solution [13]. The method proved to be rather
successful in that a generalization of the problem
could be proved as easily as the given one. Moreover,
its application clarified considerably the relation between
the problem specification and its solution, with
the result that other interpretations of the problem
statement could be analyzed as well. The result is
nevertheless rated as modest mainly for two reasons:
first, the proofs are rather long, and second, there is
no precise model or formal description.

It seems not satisfying that the verification is so
much longer than the program text it is trying to
verify. The cause in this case is not so much the
number of states to consider, but the combinatorial
problem of showing that participating processes cannot
get into, or remain in, certain states when certain
events happen. So this problem can ultimately be
reduced to the second one: the lack of a precise
model or formal description.

Another consequence of this second deficiency is
that one is never sure whether or not the proof is
complete and exhaustive. It is never clear what may
be assumed as obvious and what must be proved. In
going over the proof one must be reconvinced each
time that nothing has been omitted. A solution for
both problems may be found in recent results which
offer a representation of some classes of timing problems
in an abstract model; this allows a more precise
and concise treatment. A brief discussion follows in
the next section.

An Abstract Model for Some Timing Structures

Considerable activity was aroused recently by the
Readers and Writers Problem [2]. The basic characteristic
of the problem is to implement certain priority
rules by means of a timing structure. The problem
is that two groups of processes perform an action
exclusively, but one of the groups has preference over
the other, or more precisely:

1.  when a process P of group P performs action A, no
process Q of group Q is permitted to perform
action A in an overlapping time interval;

2.  if neither a P nor a Q is performing A, any one of
them must be able to start action A;

3.  (preference rule) if there are processes of group P
waiting to perform action A, at least one of these
should get permission to do so as soon as the
processes currently executing A are finished.




In  addition  to  these  rules  one  can  specify  whether 
or  not  processes  in  one  group  have  to  perform  action 
A  exclusively  among  one  another.  The  Readers  and 
Writers  Problem  is  stated  in  two  versions:  one  in 
which  preference  is  given  to  Readers  and  another  one 
in which Writers have priority. This accounts for two
of the four possible cases, because Readers do not
have to perform action A exclusively, whereas Writers
do. The other two cases are, using the same terminology:
two groups of Readers of which one has priority,
or two Writer groups one of which has preference
over the other.
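
For the version in which Readers have preference, a widely taught semaphore pattern can serve as a sketch. It is given here as an illustration, not as a transcription of the programs in the original paper: the class and method names are introduced for this example, and Python semaphores stand in for P,V operations.

```python
import threading

class ReadersPreference:
    """Readers share action A; Writers perform it exclusively."""

    def __init__(self):
        self.mutex = threading.Semaphore(1)  # small critical section around readcount
        self.w = threading.Semaphore(1)      # held by one Writer or by the Readers as a group
        self.readcount = 0

    def start_read(self):
        with self.mutex:
            self.readcount += 1
            if self.readcount == 1:
                self.w.acquire()             # first Reader shuts Writers out

    def end_read(self):
        with self.mutex:
            self.readcount -= 1
            if self.readcount == 0:
                self.w.release()             # last Reader lets Writers in

    def start_write(self, blocking=True):
        return self.w.acquire(blocking=blocking)

    def end_write(self):
        self.w.release()

rw = ReadersPreference()
rw.start_read()
rw.start_read()                               # Readers overlap freely
writer_refused = not rw.start_write(blocking=False)
rw.end_read()
rw.end_read()
writer_admitted = rw.start_write(blocking=False)
rw.end_write()
```

As long as at least one Reader is active, newly arriving Readers pass without delay while a Writer stays blocked on `w`, so a steady stream of Readers can block Writers permanently.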

In  the  original  paper  the  programs  for  Readers  and 
Writers were presented and their correctness was reasoned
in an informal way [2]. Other solutions were
proposed in which an attempt was made to design
symmetry in the programs for both groups [15], but
the authors of the original paper showed the inadequacy
of such a solution.

Since  solutions  of  such  problems  depend  on  timing 
structure,  the  invariant  for  P,V  operations  mentioned 
in  the  preceding  section  was  tried  to  show  that  the 
presented  programs  indeed  have  the  necessary  proper¬ 
ties  to  enforce  the  required  rule  of  preference.  The 
result  was  as  in  previous  cases:  analysis  clarified  some 
deficiencies  of  the  presented  programs,  but  involved 
rather  long  proofs  based  on  an  informal  model. 
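
The invariant for P,V operations referred to here relates three counts: the number of wait attempts, the number of signals, and the number of times a waiting process was allowed to continue. The sketch below restates it as an executable check on a simple sequential model of a semaphore. The model, its class name, and its fields are assumptions made for illustration; a real proof reasons about the programs themselves.

```python
import random

class SemaphoreModel:
    """Sequential model of wait(E) / signal(E) with an initial constant."""

    def __init__(self, initial):
        self.initial = initial
        self.count = initial     # permissions currently available
        self.blocked = 0         # wait attempts not yet granted
        self.attempted = 0       # number of attempted wait(E)'s
        self.signalled = 0       # number of executed signal(E)'s
        self.passed = 0          # times permission to continue was granted

    def wait(self):
        self.attempted += 1
        if self.count > 0:
            self.count -= 1
            self.passed += 1
        else:
            self.blocked += 1

    def signal(self):
        self.signalled += 1
        if self.blocked > 0:
            self.blocked -= 1    # a delayed process continues
            self.passed += 1
        else:
            self.count += 1

    def invariant_holds(self):
        return self.passed == min(self.attempted, self.initial + self.signalled)

random.seed(0)
model = SemaphoreModel(initial=2)
held_throughout = True
for _ in range(200):
    random.choice([model.wait, model.signal])()
    held_throughout = held_throughout and model.invariant_holds()
```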

More  recent  investigations  may  overcome  the 
earlier  difficulties.  These  go  quite  naturally  in  the 
direction  of  a  formal  model  in  which  timing  struc¬ 
tures  can  be  represented.  The  timing  structure  of  a 
process  is  represented  in  this  model  as  a  regular  ex¬ 
pression  in  which  the  terminal  symbols  are  brackets 
representing the wait and signal operations. The timing
rules are very simply expressed in terms of state
transitions of an automaton with the ground rule that
only one process at a time can cause a state transition
by placing a bracket. If counting semaphores are not
considered,  the  additional  rules  are: 

1.  an  open-bracket  can  be  placed  at  any  time; 

2.  a  close-bracket  cannot  be  placed  unless  the  cor¬ 
responding  open-bracket  is  present; 

3.  when  a  close-bracket  is  placed,  it  cancels  out  the 
corresponding  open-bracket  and  both  are  deleted. 

Considering  counting  semaphores  means  that  a  mul¬ 
tiplicity  of  brackets  of  one  kind  is  allowed,  and  in 
particular  the  initial  state  of  the  automaton  may 
contain  several  open-brackets  of  one  kind.  Verifica¬ 
tion  of  properties  due  to  timing  structures  can  be 
carried  out  in  a  more  precise  way  using  this  model 
and  the  proofs  seem  to  be  significantly  shorter  for  all 
the  cases  mentioned  here. 
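
The three rules, together with the extension to counting semaphores, can be sketched as a small state machine. This is an illustration under stated assumptions: the state is taken to be just the collection of open-brackets present, a repeated open-bracket of the same kind is assumed to have no further effect when counting is not considered, and the class and method names are introduced here.

```python
from collections import Counter

class BracketAutomaton:
    """State: how many open-brackets of each kind are present."""

    def __init__(self, initial_open=(), counting=False):
        self.counting = counting
        self.present = Counter()
        for kind in initial_open:      # the initial state may hold open-brackets
            self.place_open(kind)

    def place_open(self, kind):
        # Rule 1: an open-bracket can be placed at any time.
        if self.counting:
            self.present[kind] += 1    # counting: multiplicity is kept
        else:
            self.present[kind] = 1     # assumed: at most one of each kind

    def place_close(self, kind):
        # Rule 2: a close-bracket needs its open-bracket to be present.
        if self.present[kind] == 0:
            return False               # the transition is not possible now
        # Rule 3: the pair cancels out and both are deleted.
        self.present[kind] -= 1
        return True

plain = BracketAutomaton()
refused = not plain.place_close("E")
plain.place_open("E")
accepted = plain.place_close("E")

counted = BracketAutomaton(initial_open=["E", "E"], counting=True)
closes = [counted.place_close("E") for _ in range(3)]
```

A process that cannot place its close-bracket is delayed until another process places the matching open-bracket, so properties such as exclusion can be examined by exploring the reachable states of the automaton.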


Summary and Conclusions

Implementation  of  timing  structure  in  concurrent 
processes  results  in  a  need  for  scheduling.  The  use  of 
standard  primitives  such  as  P,V  operations  is  inade¬ 
quate  in  circumstances  where  selection  of  a  process  is 
based  on  a  priority  measure,  deadlock  situations,  and 
likely  decisions  in  the  near  future.  Sufficient  flexi¬ 
bility  can  be  achieved,  however,  when  timing  opera¬ 
tions  are  programmed  as  combinations  of  small  criti¬ 
cal  sections  and  operations  on  private  eventnames. 

Programming  a  central  agent  that  performs  the 
scheduling  task  can  bring  about  more  clarity  in  the 
structure  of  the  system,  but  the  task  of  implementing 
cooperation  in  accordance  with  certain  preference 
rules remains present. With respect to such an organization,
one should consider the sorts of scheduling
which  can  be  achieved  by  timing  rules  for  a  small 
number  of  preference  classes. 

Combinatorial explosion prohibits a useful application
of the notion "state" as a composition of the
states  of  the  individual  processes.  For  the  same  rea¬ 
son,  there  is  little  hope  that  Floyd's  inductive  asser¬ 
tions  method  can  be  usefully  extended  to  concurrent 
processes.  Instead,  a  more  promising  approach  seems 
to  be  to  prove  that  an  abstract  model  of  the  concur¬ 
rent  processes  has  certain  desired  properties  that  will 
not  get  lost  in  more  detailed  versions  of  the  programs 
for  these  processes.  Some  results  were  obtained  in 
this  way,  first  by  applying  an  invariance  rule  to  pro¬ 
grams  with  a  given  timing  structure,  and  more  recent¬ 
ly  by  means  of  an  abstract  model  for  the  timing 
structure  of  concurrent  processes  and  an  automaton 
that  simulates  their  behavior.  Satisfactory  verification 
appears  to  be  feasible  using  this  model  and  it  seems 
worthwhile  to  investigate  what  class  of  timing  prob¬ 
lems  can  be  treated  in  this  way. 

Some Practical Uses For Analytic Models
in the Study of Computer System Performance

John  W.  McCredie 

Introduction 

At  the  1973  national  meeting  of  the  ACM  Special 
Interest  Group  on  Measurement  and  Evaluation, 
authors  presenting  analytic  models  of  computer  sys¬ 
tems  were  under  constant  attack  from  a  large  group 
of practitioners. Two labels ("academics" and "those in
the ditches") quickly entered the local jargon. In
public sessions and in small informal groups, debaters
presented  classical  arguments  about  the  relative 
merits  of  theoretical  and  empirical  studies.  The  focus 
of  these  discussions  was  the  analysis  of  computer 
system  performance,  but  many  of  the  arguments  had 
been  presented  time  and  again  in  other  domains.  The 
argument  that  overly  simplified  analytic  models  are 
misleading  was  countered  with  the  charge  that  reels 
of  experimental  data  without  an  underlying  theory 
are useless. The "Scientific Method" presents a fundamental
interaction between theory and empiricism
that  is  apparently  lacking  in  many  current  perform¬ 
ance  evaluation  studies. 

Two  reasons  why  "academic"  analytic  computer 
system  models  often  remain  unused  by  "those  in  the 
ditches"  are:  (a)  reports  describing  them  seldom  con¬ 
tain  discussions  about  their  validity  for  describing 
empirical  observations  and  (b)  often  the  results  are  so 
complicated  that  users  are  not  willing  to  invest  the 
time  needed  to  understand  the  model  and  its  be¬ 
havior.  The  main  purpose  of  descriptive  models  is  to 
account  for  observed  phenomena  of  physical  systems. 
However,  the  complexity  of  most  actual  systems  re¬ 
quires  that  any  particular  model  must  address  a  lim¬ 
ited  and  constrained  subset  of  state  variables.  Thus, 
each  model  is  an  abstraction  of  a  particular  set  of 
important features of interest to an analyst or designer.
Simplifications required to make an abstraction
manageable by a particular solution technique
limit  both  scope  and  power.  Since  analytic  models  are 
characterized  by  symbolic  formulations  and  deduc¬ 
tive  derivations,  they  require  many  simplifying  as¬ 
sumptions.  The  consequences  of  these  assumptions 
must  be  explored  before  one  applies  such  models. 
The  following  paragraph  outlines  some  of  the  general 
ways  analytic  models  may  be  useful  in  computer 
system  performance  analysis,  and  the  body  of  the 
article  contains  a  number  of  specific  analytic  ex¬ 
amples  developed  at  Carnegie-Mellon. 

Probably  the  two  most  common  techniques  used 
by  performance  evaluation  practitioners  are: 

(1)  the  design,  implementation,  and  analysis  of 
empirical  investigations; 

(2)  the  construction  and  use  of  specific,  large,  com¬ 
plex  simulation  models. 

These  two  methodologies  have  areas  of  applicability 
which  interact  with  those  of  analytic  models.  For 
example,  analytic  models  often  expand  to  the  point 
where  large  computational  effort  is  required  to  calcu¬ 
late  results.  Often  a  point  is  reached  when  a  modest 
simulation  may  be  a  more  cost-effective  approach. 
Large  simulations  may  grow  into  system  prototypes. 
Empirical  investigations  can  provide  insight  required 
to  design  better  models,  and  these  models  can  indi¬ 
cate  which  of  many  possible  parameters  or  sub¬ 
systems  are  good  candidates  for  more  detailed  study 
via  simulation  and  experimentation.  Other  important 
uses  for  analytic  models  are  as  reference  systems  to 
aid  in  both  the  debugging  and  statistical  analysis  of 
simulation  experiments. 

To  be  really  useful,  analytic  formulations  should 
include  the  essential  features  of  a  system,  or  sub¬ 
system,  and  should  have  solutions  that  are  readily 
understandable.  The  necessity  of  spending  excessive 
computer  effort  to  solve  for  each  parameter  value  of 
an  analytic  model  casts  doubt  upon  its  usefulness 
since  simulations  typically  can  handle  more  detailed 
cases  with  similar  effort.  The  conclusion  from  these 
considerations  is  that  analytic  models,  empirical  in¬ 
vestigations,  and  simulation  studies  should  comple¬ 
ment  one  another.  Each  technique  serves  a  useful 
purpose  when  applied  properly. 


The goal of this article is to illustrate, with three
specific examples, that even though most analytic
computer system models are highly simplified abstractions
of actual systems, they can be very useful
in performance evaluation studies. The first example
is a discrete-time Markov model that illustrates the
effects of priority scheduling in a closed cyclic service
network. The primary uses of this class of model are
for educational or demonstration purposes. The
second model is a modification of a classic multiserver
queueing system. This model is helpful in
studying different scheduling algorithms for load
leveling in computer networks. The final model allows
an analyst to explore economically part of the
large design space of a multiprocessing computer
system in order to focus attention on areas which
need further study.

The  level  of  detail  in  the  following  paragraphs 
varies  with  each  model.  The  purpose  is  not  to  present 
detailed  derivations,  but  to  describe  the  structures  of 
three  different  types  of  models  and  to  summarise  the 
results  of  the  detailed  analysis.  Since  the  first  model 
is  not  too  complicated  the  interested  reader  should  be 
able  to  derive  the  results  presented  in  equations  (1) 
through  (7).  The  second  model  is  more  involved,  but 
the  reader  with  some  background  in  queueing  theory 
should  be  able  to  derive  equations  (8)  through  (10) 
with  little  effort.  The  last  model  is  an  application  of 
an  important  theorem  concerned  with  networks  of 
queues.  Both  the  classic  reference  for  this  theorem, 
and  the  details  of  the  theorem's  application  to  this 
model are rather involved. Thus the results of this
analysis  are  presented  as  an  ALGOL  procedure  so 
that  the  interested  reader  may  use  the  model  directly 
and  then  check  the  derivations  from  the  references. 

Discrete  Markov  Model 

The  basic  concepts  of  a  Markov  process  are  system 
state  and  state  transition.  For  a  discrete-time  Markov 
process  it  is  convenient  to  assume  that  the  time 
between  transitions  is  a  constant  equal  to  unity.  Let 
there  be  N  states  in  the  system  numbered  from  1  to 
N.  Then  for  a  simple  Markov  process  the  probability 
of  a  transition  to  state  j  during  the  next  time  interval, 
given  that  the  system  now  occupies  state  i,  is  a 
function  only  of  i  and  j  and  not  of  any  history  of  the 
system  before  its  arrival  in  i.  Thus  one  may  specify  a 
set of conditional probabilities, p_ij, which are the
probabilities that a system which now occupies state i
will occupy state j after its next transition. The transition
matrix for a Markov process is the N by N matrix
whose elements, p_ij, satisfy the following equations.

(1)  \sum_{j=1}^{N} p_{ij} = 1

(2)  0 \le p_{ij}

Consider the following model of a simple multiprogramming
system. At every point in time there are
two jobs in the system receiving, or waiting for,
service from one of two subsystems: (a) a central
processing unit (Pc), and (b) an input/output
(M.drum) system with characteristics similar to a
drum. The entire system is synchronized to schedule
jobs at the end of every timing interval, which is equal
to one revolution of the drum. The two jobs which
cycle through the system come from two different
priority classes. If there is a job from priority class 1
at the Pc, the probability that it will require another
interval of Pc time is (1-u1) and the probability that
it will make a request to the drum subsystem is u1.
The corresponding probabilities for jobs of priority
class 2 are (1-u2) and u2. If a job makes a drum
request, it is blocked from additional Pc processing
until the request is satisfied. When a job from class 1
is receiving service from the M.drum system, the
probability that it will require another interval of
service is (1-w1) and the probability that it will finish
and return to the Pc for additional processing is w1.
The corresponding probabilities for jobs from priority
class 2 are (1-w2) and w2.

Whenever  a  job  completes  its  work  and  leaves  the 
system,  it  is  immediately  replaced  by  a  new  job  from 
the  same  priority  class  having  identical  parameters  ui 
and  wi.  Figure  1  illustrates  the  structure  of  this 
model. 


Figure  1  Discrete-Time  Markov  Model 


Define  the  following  configurations  as  the  states  of 
the  system: 

S1: Both jobs are requesting service from the Pc.

S2:  Both  jobs  are  requesting  service  from  the 
M.drum  system. 

S3:  A  job  from  priority  class  1  is  requesting  service 
from  the  Pc  and  one  from  class  2  is  requesting 
service  from  the  M.drum  system. 

S4:  A  job  from  priority  class  2  is  requesting  service 
from  the  Pc  and  one  from  class  1  is  requesting 
service  from  the  M.drum  system. 

To  specify  the  system  completely  we  must  determine 
a  scheduling  rule  to  decide  which  job  will  receive 
service  from  a  subsystem  if  two  jobs  are  simultane¬ 
ously  requesting  service  for  the  next  service  interval. 
Assume  that  class  1  has  the  higher  priority  and  when¬ 
ever  two  jobs  are  waiting  for  service  the  one  from 
class  1  will  be  chosen  for  processing. 

The following matrix contains the transition probabilities
for this system. Each element, p_ij, is the
probability that if the system is in state i at the end
of a timing interval it will be in state j at the end of
the next interval.

  i \ j    1            2            3               4

  1        (1-u1)       0            0               u1
  2        0            (1-w1)       w1              0
  3        (1-u1)w2     u1(1-w2)     (1-u1)(1-w2)    u1w2
  4        (1-u2)w1     u2(1-w1)     u2w1            (1-u2)(1-w1)

The steady state probabilities, p_j, that this system
will be in state j after a large number of transitions
may be calculated from the following equations.

(3)  p_j = \sum_{i=1}^{N} p_i p_{ij},  j = 1, ..., N

(4)  \sum_{j=1}^{N} p_j = 1


One  may  now  eliminate  variables  so  that  all  of  the 
steady  state  probabilities  may  be  expressed  in  terms 
of  just  one  state  probability.  The  following  equations 
are  the  results  of  expressing  all  state  variables  of  this 
system  in  terms  of  p4 . 

(5)  p_1 = (u_2 + w_1 - u_2 w_1 - u_1 u_2) p_4 / u_1

(6)  p_2 = (u_1 u_2 - u_1 u_2 w_2 + u_2 w_2 - u_2 w_1 w_2) p_4 / (w_1 w_2)

(7)  p_3 = u_2 p_4 / w_2

These results may be used in equation (4) to determine
p4 directly. One may now compute various
performance parameters of the system. For example,
Pc utilization (the probability that the Pc is busy) is
p1 + p3 + p4 = 1 - p2 and M.drum utilization is
p2 + p3 + p4 = 1 - p1.
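As a hedged illustration of equations (3) through (7), the following Python sketch evaluates the closed-form ratios and checks them against the balance equations. The function names and the parameter values in the demonstration are assumptions chosen for illustration; they do not appear in the original article.

```python
# Sketch only: steady state of the four-state model from equations (5)-(7),
# normalized with equation (4). All names here are illustrative assumptions.

def transition_matrix(u1, u2, w1, w2):
    """Rows are the current state S1..S4, columns the next state S1..S4."""
    return [
        [1 - u1, 0.0, 0.0, u1],
        [0.0, 1 - w1, w1, 0.0],
        [(1 - u1) * w2, u1 * (1 - w2), (1 - u1) * (1 - w2), u1 * w2],
        [(1 - u2) * w1, u2 * (1 - w1), u2 * w1, (1 - u2) * (1 - w1)],
    ]

def steady_state(u1, u2, w1, w2):
    """Return [p1, p2, p3, p4] from the ratios in (5)-(7) and the sum in (4)."""
    r1 = (u2 + w1 - u2 * w1 - u1 * u2) / u1                              # p1/p4, eq. (5)
    r2 = (u1 * u2 - u1 * u2 * w2 + u2 * w2 - u2 * w1 * w2) / (w1 * w2)   # p2/p4, eq. (6)
    r3 = u2 / w2                                                         # p3/p4, eq. (7)
    p4 = 1.0 / (r1 + r2 + r3 + 1.0)                                      # eq. (4)
    return [r1 * p4, r2 * p4, r3 * p4, p4]

def balance_residual(p, matrix):
    """Largest violation of equation (3), p_j = sum_i p_i * p_ij."""
    n = len(p)
    return max(
        abs(p[j] - sum(p[i] * matrix[i][j] for i in range(n)))
        for j in range(n)
    )

if __name__ == "__main__":
    u1, u2, w1, w2 = 0.3, 0.2, 0.4, 0.6   # arbitrary illustrative parameters
    p = steady_state(u1, u2, w1, w2)
    print("steady state:", p)
    print("residual of (3):", balance_residual(p, transition_matrix(u1, u2, w1, w2)))
    print("Pc utilization:", 1 - p[1], "M.drum utilization:", 1 - p[0])
```

The residual check is included so that a reader who re-derives (5) through (7) by hand can confirm the algebra for any parameter setting.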

The  effects  of  different  scheduling  algorithms  may 
be  illustrated  with  this  model  by  assigning  different 
parameter  values  to  the  different  priority  classes.  As 
examples consider the cases of "expected-shortest-job-first"
and "expected-longest-job-first" scheduling
disciplines. Since jobs from priority class 1 are always
processed first we can model these two scheduling
algorithms by properly assigning u1, u2, w1 and w2.

The  mean  number  of  service  quanta  a  job  will  receive 
from  the  Pc  and  M.drum  systems  are  1/ui  and  1/wi 
respectively. If u1 > u2 the Pc will process the job
with the shorter mean service request when both jobs
are requesting Pc service. If u1 < u2 the Pc will
process the longer request. When u1 = w1 = .5 and u2
= w2 = .1 (case 1) Pc utilization and M.drum utilization
are both .75. When the scheduling is reversed by
letting u1 = w1 = .1 and u2 = w2 = .5 (case 2) the Pc
and M.drum utilization drop to .58. One measure of
job  throughput  is  the  steady  state  probability  that  at 
the  end  of  a  timing  interval  a  job  will  be  leaving  the 
Pc  to  request  M.drum  service.  When  the  shorter  jobs 
are  given  high  priority  (case  1)  the  probability  is  .25 
that  a  class  1  job  will  be  completing  Pc  service  and 
.025  that  a  class  2  job  will  be  finishing.  When  the 
longer  jobs  are  given  high  priority  (case  2)  these 
probabilities become .05 and .04. Thus in the latter
case, class 1 throughput is reduced by a factor of
five, total throughput is reduced by a factor of three,
class 2 throughput is increased by sixty percent, and
Pc and M.drum utilization decrease by more than
twenty percent.
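The figures quoted for case 1 and case 2 can be reproduced numerically. The sketch below is an assumption-laden illustration rather than the author's procedure: it iterates the transition matrix until the state probabilities settle (power iteration), then reads off the utilizations and the per-class probabilities of leaving the Pc. A class 1 job is served by the Pc in states S1 and S3, while a class 2 job is served only in state S4, which is why the two throughput expressions differ. All function and key names are hypothetical.

```python
# Sketch only: reproduce the case 1 / case 2 figures by power iteration.
# Names are illustrative assumptions, not part of the original article.

def transition_matrix(u1, u2, w1, w2):
    """Rows are the current state S1..S4, columns the next state S1..S4."""
    return [
        [1 - u1, 0.0, 0.0, u1],
        [0.0, 1 - w1, w1, 0.0],
        [(1 - u1) * w2, u1 * (1 - w2), (1 - u1) * (1 - w2), u1 * w2],
        [(1 - u2) * w1, u2 * (1 - w1), u2 * w1, (1 - u2) * (1 - w1)],
    ]

def steady_state_by_iteration(matrix, steps=5000):
    """Apply p <- p * matrix repeatedly, starting from a uniform distribution."""
    n = len(matrix)
    p = [1.0 / n] * n
    for _ in range(steps):
        p = [sum(p[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return p

def summarize(u1, u2, w1, w2):
    """Utilizations and per-class probabilities of leaving the Pc per interval."""
    p1, p2, p3, p4 = steady_state_by_iteration(transition_matrix(u1, u2, w1, w2))
    return {
        "pc_utilization": 1 - p2,
        "drum_utilization": 1 - p1,
        "class1_leaves_pc": (p1 + p3) * u1,   # class 1 holds the Pc in S1 and S3
        "class2_leaves_pc": p4 * u2,          # class 2 holds the Pc only in S4
    }

if __name__ == "__main__":
    print("case 1:", summarize(0.5, 0.1, 0.5, 0.1))
    print("case 2:", summarize(0.1, 0.5, 0.1, 0.5))
```

Power iteration was chosen here only because it needs no linear-algebra library; solving equations (3) and (4) directly would serve equally well.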




There have been a number of empirical and simulation
studies that have demonstrated the effects of
"shortest-job-first" scheduling rules in multiprogramming
systems. A recent article by Sherman, Baskett,
and Browne [6] reviews the results of some of these
experiments. Using a trace of real service requests as
input, they built a simulation model of an actual
multiprogramming system. They found no counterexample
to the hypotheses that the "best" way to
schedule the Pc in such a system is to give it to the
job  that  will  compute  for  the  shortest  period  of  time 
before  issuing  an  M.drum  request  and  the  "worst" 
way  is  to  give  the  Pc  to  the  job  that  will  compute  for 
the  longest  period.  The  objective  functions  used  for 
their  evaluation  studies  were  based  on  utilization  and 
throughput measures. They did not consider dispatching
rules which delayed tasks when resources were
available  to  process  them. 

The  model  developed  in  this  section  illustrates 
important fundamental ideas of scheduling theory in
a  way  that  is  easy  to  understand,  derive,  and  manipu¬ 
late.  The  model  may  be  easily  modified  to  become  a 
continuous-time  Markov  model  of  a  fully  preemptive 
scheduling  policy.  Many  other  illustrative  variations 
are  possible.  This  model  has  been  used  successfully  in 
classes  at  Carnegie-Mellon  as  the  focal  point  for  lec¬ 
tures  on  scheduling  and  as  the  basis  for  more  com¬ 
plicated  simulation  assignments. 

Scheduling  Model 

One  of  the  important  goals  of  networks  of  com¬ 
puter  systems  is  load  sharing.  Jobs  from  a  heavily 
utilized  facility  may  be  shipped  to  a  lightly  loaded 
one  in  order  to  improve  overall  system  performance. 
There  are  many  interesting  problems  concerned  with 
the  properties  of  different  scheduling  policies  for 
such  configurations.  A  recent  paper  by  Balachandran, 
McCredie, and Mikhail [1] outlines a number of mathematical
programming approaches to some of these
problems.  The  analytic  model  presented  in  this  sec¬ 
tion  focuses  upon  one  small  problem  from  the  general 
area  of  load  leveling  policies  in  computer  networks. 

Consider  a  simple  network  of  two  computers 
having processing rates u1 and u2 jobs per unit time.
Each  of  these  machines  may  process  any  job  sub¬ 
mitted  to  the  network.  The  machines  are  functionally 
homogeneous,  but  their  rates  differ.  Although  the 
following  scheduling  policy  seems  rather  complicated, 
it  is  conceptually  very  simple.  The  basic  idea  is  to 
process  all  work  at  machine  1  until  the  backlog  at  this 
processor  is  equal  to  (C-1)  jobs  and  then  utilize 
machine 2. This policy seems counterproductive at
first glance because network capacity will be idle
when there are jobs waiting for service. However, if
the rate of machine 1 (u1) is greater than the rate of
machine 2 (u2), it may be advantageous to build up a
backlog  at  machine  1  before  utilizing  machine  2.  By 
setting  C=2  the  policy  will  direct  an  arriving  job  to 
machine  1  if  it  is  idle,  and  will  immediately  assign  a 
new  job  to  machine  2  if  it  is  idle  and  machine  1  is 
busy.  By  setting  C=1,  machine  1  will  process  all  work 
and  machine  2  will  always  be  idle.  Figure  2  illustrates 
how  this  system  works. 

The  scheduling  policy  under  study  is  the  follow¬ 
ing:  at  arrival  time,  schedule  a  job  (1)  for  machine  1 
if  it  is  idle  or  if  there  are  less  than  (C-1)  jobs  in  the 
queueing  system;  (2)  for  machine  2  if  there  are  (C-1) 
jobs  in  the  queueing  system,  machine  2  is  idle  and 
machine  1  is  busy;  (3)  for  machine  1  if  there  are  (C-1) 
jobs  in  the  queueing  system  and  machine  2  is  busy; 
(4)  for  no  particular  machine  if  there  are  C  or  more 
jobs  in  the  queueing  system  (for  this  last  case  the  job 
will  be  kept  in  an  "order-of-arrival"  waiting  line  until 
there  are  less  than  C  jobs  in  the  queueing  system  and 
then  it  will  be  dispatched  according  to  the  rules 
presented  above).  The  basic  decision  problem  for  this 
scheduling  algorithm  is  to  choose  C,  as  a  function  of 
u1, u2, and the job-arrival rate, so that some measure
of  system  performance  is  maximized. 
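The four dispatching rules can be written down as a small decision function. This is a minimal sketch under one stated assumption: "jobs in the queueing system" is taken to count every job present, including those in service on either machine, which is how the state definitions that follow appear to use the term. The function name and argument names are hypothetical.

```python
# Sketch only: the arrival-time dispatching rules (1)-(4) as a decision function.
# Assumption: jobs_in_system counts all jobs present, including those in service.

def dispatch(jobs_in_system, machine1_busy, machine2_busy, C):
    """Return 1 or 2 for the chosen machine, or None if the job must wait."""
    if not machine1_busy or jobs_in_system < C - 1:
        return 1                      # rule (1)
    if jobs_in_system == C - 1:
        if not machine2_busy:
            return 2                  # rule (2): machine 1 busy, machine 2 idle
        return 1                      # rule (3): machine 2 busy
    return None                       # rule (4): wait in order-of-arrival line

if __name__ == "__main__":
    # With C = 2, machine 2 is used as soon as machine 1 is busy.
    print(dispatch(1, True, False, 2))
    # With C = 1, machine 2 is never selected.
    print(dispatch(1, True, False, 1))
```

With C = 5, for example, a job arriving to find four jobs present and machine 2 idle is sent to machine 2, while one arriving to find only three present joins the backlog at machine 1.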

An  analytic  model  may  be  used  to  investigate  this 
scheduling  policy  to  determine  under  what  circum¬ 
stances,  if  at  all,  it  is  advisable  to  allow  some  system 
capacity  to  remain  idle  when  work  is  available  for 
processing.  Let  the  input  to  the  system  be  from  a 
Poisson process with rate λ, and let the service times
at each machine be exponentially distributed random
variables with parameters u1 and u2. Define the following
system state probabilities:

p_{k,0} = probability that there are k jobs waiting for
service from machine 1, and machine 2 is idle
(k = 0, 1, ..., C-1)

p_{k,1} = probability that there are k jobs waiting for
service from machine 1, and machine 2 is busy
(k = 0, 1, ..., C-2)

p_k = probability that there are a total of k jobs in
the system waiting in the common ordered
queue and waiting for or being serviced by
machines 1 and 2 (k = C, C+1, ...)




Figure  2  Scheduling  Model 


Using standard techniques for the analysis of exponential
queueing systems (e.g., as presented in the
book by Saaty [6]) one may derive the following
steady state recurrence equations for the state probabilities.
These equations apply only when a steady
state exists (i.e., when the input rate is less than the
total processing capacity u1 + u2).

(8)  (\lambda + u_1) p_{k,0} = u_1 p_{k+1,0} + \lambda p_{k-1,0} + u_2 p_{k,1},  k = 1, ..., C-2

(9)  (\lambda + u_1 + u_2) p_{k,1} = \lambda p_{k-1,1} + u_1 p_{k+1,1},  k = 1, 2, ..., C-2

(10)  (\lambda + u_1 + u_2) p_k = \lambda p_{k-1} + (u_1 + u_2) p_{k+1},  k = C+1, C+2, ...

It  is  beyond  the  scope  of  this  article  to  describe 
the  solution  of  these  equations  in  detail.  However, 
one  may  use  generating  functions  to  reduce  the  in¬ 
finite  set  of  equations  of  relation  (10)  to  one  simple 
expression.  Then  all  that  is  required  is  to  solve  2C 
linear  equations  by  standard  numerical  techniques. 
Although  a  simple  closed  form  for  the  result  has  not 
been  found,  the  computations  are  straight-forward 
and  easy  to  perform. 
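One way to see why the infinite family of equations in relation (10) collapses is the standard geometric trial solution sketched below. This is offered as an illustration and is not necessarily the generating-function route the author had in mind. Here \lambda denotes the Poisson arrival rate and \mu = u_1 + u_2 is shorthand introduced only for this sketch.

```latex
% Trial solution for the tail equations (10), writing \mu = u_1 + u_2:
%   (\lambda + \mu) p_k = \lambda p_{k-1} + \mu p_{k+1},  k = C+1, C+2, ...
\begin{align*}
p_k = p_C\,\rho^{\,k-C}
  &\;\Longrightarrow\; (\lambda+\mu)\,\rho = \lambda + \mu\,\rho^{2}
  \;\Longrightarrow\; (\mu\rho-\lambda)(\rho-1) = 0, \\
\rho &= \frac{\lambda}{u_1+u_2} < 1
  \quad\text{(the root } \rho = 1 \text{ cannot be normalized)}, \\
\sum_{k=C}^{\infty} p_k &= \frac{p_C}{1-\rho}.
\end{align*}
```

Under this sketch every tail probability is expressed through the single unknown p_C, which is consistent with the remark that only a finite set of linear equations remains to be solved.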


It  is  also  beyond  the  scope  of  this  article  to  de¬ 
scribe  in  detail  the  many  interesting  results  which 
may  be  obtained  by  solving  these  equations  for  vari¬ 
ous  parameter  settings.  However,  a  few  conclusions 
are  easy  to  summarize.  The  expected  value  of  the 
time  spent  in  the  system,  both  waiting  for  and  receiv¬ 
ing  service,  was  the  performance  index  used  for  the 
following comparisons. When the ratio of the rate of
machine  1  to  machine  2  is  small  (2  to  4),  then  the 
optimum  value  for  C  is  also  small  (2  to  4)  and  the 
performance  curve  is  relatively  flat.  Thus  the  im¬ 
provement  one  could  expect  from  implementing  this 
type  of  policy  in  this  type  of  situation  is  only  a  few 
percent.  But  as  the  ratio  of  the  processing  rates  in¬ 
creases  to  ten  for  example,  improvements  of  the 
order  of  twenty  to  twenty-five  percent  are  possible 
by  increasing  C  from  2  to  six  or  seven.  Around  the 
optimum  value  of  C  the  performance  curve  is  again 
relatively  flat.  In  this  latter  type  of  situation  it  is 
often  better  never  to  use  machine  2  than  to  set  C=2. 

The  model  described  in  this  section  may  be  used 
to  examine  a  number  of  theoretical  scheduling  ques¬ 
tions.  The  results  of  this  kind  of  analysis  can  help  to 
formulate  realistic  policies.  The  performance  of  oper¬ 
ational  scheduling  algorithms  should  be  examined  by 
simulations  and  measurements  of  prototype  systems. 
However,  the  construction  of  these  more  expensive 
studies  can  be  guided  by  insights  gained  from  the 
analytic  results. 


Memory  Interference  Model 

One  of  the  crucial  problems  in  the  design  of  a 
multiprocessing  computing  system  is  the  interference 
which  occurs  when  more  than  one  processor  requests 
information from the same shared memory (see Wulf
[8]). Performance will be degraded in such circumstances
due to queueing delays. Strecker [7] studied
this problem and presented a number of models to
approximate the effects of memory interference.
Bhandarkar and Fuller [2] have recently surveyed
techniques  for  analyzing  this  type  of  interference  in 
multiprocessor  systems.  The  analysis  which  follows 
differs  from  these  other  reports  in  a  number  of  ways 
and  represents  an  alternative  framework  for  analytical 
study.  The  present  model  is  based  upon  different 
assumptions  than  those  used  by  Strecker,  Bhandarkar 
and  Fuller.  It  allows  one  to  consider  the  effects  of  a 
cache  memory  for  each  of  the  processors,  as  well  as 
the  situation  in  which  one  of  the  memory  modules 
has a different speed and probability of being accessed
than all other modules. The model discussed in
this section was presented in greater detail in a paper
by McCredie [4].

Figure  3  illustrates  the  structure  of  the  model. 
There  are  N  processors  each  of  which  may  access  any 
one  of  B  memory  banks  through  an  N  by  B  cross 
point switch. Strecker [7] defines an abstraction
called  the  "unit  instruction",  and  shows  how  more 
complicated  instructions  may  be  synthesized  from 
various  combinations  of  unit  instructions.  Each  unit 
instruction  consists  of  one  memory  reference  and  a 
random  interval  of  processor  activity.  The  perform¬ 
ance  of  a  particular  organization  of  memories  and 
processors  may  be  measured  in  terms  of  the  mean 
unit  execution  rate  (UER)  which  is  the  mean  speed  at 
which  the  configuration  can  execute  unit  in¬ 
structions. 

For  the  model  of  Figure  3,  the  time  required  to 
decode  and  execute  a  unit  instruction  will  be  an 
exponentially distributed interval having a mean of
1/LAM nanoseconds. At the end of this time the
processor must access memory. The probability that
one of the B banks of main memory will be referenced
is R and the probability that the reference will
be to the cache memory associated with each processor
is (1-R). The cache will be assumed to have an
access  time  that  is  much  smaller  than  the  processor 
delay  and  will  be  ignored.  Since  the  probability  of 
accessing  the  cache  is  independent  of  state  informa¬ 
tion  (such  as  how  many  accesses  have  already  been 
directed  to  the  cache)  the  number,  X,  of  consecutive 
unit  instructions  the  processor  may  execute  before 
referencing  main  memory  is  geometrically  distributed 
with  mean  1/R. 

The  sum  of  a  geometrically  distributed  number  of 
exponential random variables is another exponentially
distributed random variable. Thus, for each processor,
the time from the completion of one reference
to main memory until the next access to main
memory is exponentially distributed with mean
1/(R*LAM). If there is no cache memory for each
processor,  R  is  equal  to  unity.  The  value  of  R  will 
decrease  as  the  size  of  the  cache  increases. 
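The claim that a geometrically distributed number of exponential intervals sums to another exponential interval can be checked by a small Monte Carlo experiment. The sketch below is illustrative only; the function names, sample size, seed, and parameter values are assumptions. It compares the sample mean with 1/(R*LAM) and the fraction of samples exceeding that mean with exp(-1), the value an exponential distribution would give.

```python
# Sketch only: Monte Carlo check that the time between main-memory references
# behaves like an exponential variable with mean 1/(R*LAM).
import math
import random

def time_to_next_main_memory_reference(rng, lam, r):
    """Add exponential unit-instruction times until one references main memory."""
    total = 0.0
    while True:
        total += rng.expovariate(lam)   # one unit instruction, mean 1/lam
        if rng.random() < r:            # with probability r it goes to main memory
            return total

def estimate(lam, r, samples=100_000, seed=0):
    """Return (sample mean, fraction of samples above the theoretical mean)."""
    rng = random.Random(seed)
    mean_theory = 1.0 / (r * lam)
    draws = [time_to_next_main_memory_reference(rng, lam, r) for _ in range(samples)]
    sample_mean = sum(draws) / samples
    tail_fraction = sum(1 for d in draws if d > mean_theory) / samples
    return sample_mean, tail_fraction

if __name__ == "__main__":
    lam, r = 2.0, 0.25                  # illustrative values only
    sample_mean, tail_fraction = estimate(lam, r)
    print("sample mean:", sample_mean, "theory:", 1.0 / (r * lam))
    print("P(T > mean):", tail_fraction, "theory:", math.exp(-1.0))
```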

Assume  that  the  time  that  a  module  of  main 
memory is blocked while an access is completed is an
exponentially distributed random variable with mean
1/u nanoseconds for all memory banks but the first,
which  will  have  a  mean  of  1/v  nanoseconds.  This 
exponential  delay  represents  the  total  cycle  time  of 
main  memory  for  the  different  classes  of  accesses  as 
well  as  any  switching  delay  required  to  link  N  proc¬ 
essors  with  B  memories.  Define  f  to  be  the  probabil¬ 
ity  that  a  request  to  main  memory  will  be  to  the  first 
memory  bank.  Assume  that  requests  to  all  other  (B-1) 
modules  are  uniformly  distributed  and  thus  the  prob¬ 
ability  that  a  request  goes  to  any  particular  memory 
bank  is: 

(11)  P(\text{reference to module } j \mid \text{reference is to main memory})
      = \begin{cases} f, & j = 1, \quad 0 < f < 1 \\ \dfrac{1-f}{B-1}, & j = 2, \ldots, B \end{cases}

A  processor  may  not  issue  a  request  to  any  memory 
module  until  it  has  received  and  processed  the  in¬ 
formation  from  the  preceding  memory  access. 


The assumptions stated in the previous paragraphs
may be modified slightly to adjust the model to more
realistic situations such as processor utilization of
words from memory immediately after memory access
and during memory rewrite. However, most of
the assumptions are required to keep the mathematics
reasonable. Using a powerful theorem originally presented
by J. R. Jackson [3] one may solve this model
to determine the mean unit execution rate, UER, as a
function of all of the parameters defined above. The
details of the application of this theorem to the
present situation are contained in the previously
referenced paper by McCredie. Although the equations
are rather complicated, the solution may be
evaluated by a straightforward, fast algorithm which
has a running time of a few milliseconds and is
proportional to N^2. The algorithm is presented
below in ALGOL to demonstrate that it is computationally
quite simple.

REAL PROCEDURE UER(LAM,V,U,N,B,R,F);
REAL LAM,V,U,R,F;
INTEGER N,B;
COMMENT LAM,V,U, and N must be positive, R and
   F must be probabilities and B, the number of memory
   banks, must be greater than 1;
BEGIN
   REAL REFPROB,DENOM,EM;
   REAL ARRAY W(0:N),T(0:N),A(0:N),P(0:N);
   INTEGER K,J;
   COMMENT REFPROB is the probability of referencing
      any one of banks 2 through B, as in equation (11);
   REFPROB:=(1.0-F)/(B-1);
   W(0):=T(0):=A(0):=DENOM:=1.0;
   EM:=0.0;
   FOR K:=1 STEP 1 UNTIL N DO
   BEGIN
      A(K):=A(K-1)*(B+K-2)/K;
      W(K):=W(K-1)*R*LAM*(N-K+1);
      T(K):=0.0;
      FOR J:=0 STEP 1 UNTIL K DO
         T(K):=T(K)+A(J)*(F/V)↑(K-J)*(REFPROB/U)↑J;
      DENOM:=DENOM+W(K)*T(K);
   END;
   COMMENT P(K) is the normalized weight of the states
      with K requests at main memory and EM is its mean;
   P(0):=1.0/DENOM;
   FOR K:=1 STEP 1 UNTIL N DO
   BEGIN
      P(K):=T(K)*W(K)/DENOM;
      EM:=EM+K*P(K);
   END;
   UER:=(N-EM)*LAM;
END OF PROCEDURE UER;
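For readers without an ALGOL compiler, the procedure can be transcribed almost line for line into Python. The following is a hedged port rather than an independent implementation: the lower-case names mirror the ALGOL identifiers, and the argument check is an addition that simply enforces the requirement stated in the procedure's comment.

```python
# Sketch only: a direct Python transcription of the ALGOL procedure UER.
# Identifier names mirror the ALGOL ones; the ValueError check is an addition.

def uer(lam, v, u, n, b, r, f):
    """Mean unit execution rate for n processors and b memory banks."""
    if b <= 1:
        raise ValueError("b, the number of memory banks, must be greater than 1")
    refprob = (1.0 - f) / (b - 1)       # probability of each of banks 2..b
    w = [1.0] + [0.0] * n
    t = [1.0] + [0.0] * n
    a = [1.0] + [0.0] * n
    denom = 1.0
    for k in range(1, n + 1):
        a[k] = a[k - 1] * (b + k - 2) / k
        w[k] = w[k - 1] * r * lam * (n - k + 1)
        t[k] = sum(
            a[j] * (f / v) ** (k - j) * (refprob / u) ** j
            for j in range(k + 1)
        )
        denom += w[k] * t[k]
    # em accumulates k * P(k), with P(k) = t[k] * w[k] / denom as in the ALGOL text
    em = sum(k * t[k] * w[k] / denom for k in range(1, n + 1))
    return (n - em) * lam

if __name__ == "__main__":
    # One processor, two identical banks, no cache: no interference is possible.
    print(uer(lam=1.0, v=1.0, u=1.0, n=1, b=2, r=1.0, f=0.5))
    # Two processors sharing the same two banks.
    print(uer(lam=1.0, v=1.0, u=1.0, n=2, b=2, r=1.0, f=0.5))
```

A useful sanity check is the single-processor case, where the result should reduce to LAM/(1 + R*LAM*(F/V + (1-F)/U)), the reciprocal of the mean time per unit instruction with no queueing.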


Two  of  the  necessary  assumptions  for  this  model 
were  that  both  the  processor  execution  times  and  the 
memory  cycle  times  were  exponentially  distributed 
random  variables.  A  uniformly  distributed  processing 
time  and  an  approximately  constant  memory  cycle 
time  are  closer  approximations  to  the  hardware  per¬ 
formance  data  of  multiprocessor  configurations  such 
as Carnegie-Mellon's C.mmp [8]. To check the effects
of  these  assumptions  we  built  a  simple  simulation  of 
the  system  and  compared  the  results  for  different 
parameters.  The  simulation  curves  were  similar  to  the 
analytic  results  over  a  wide  range  of  values. 

Summary 

The  primary  goal  of  this  article  is  to  show  that 
analytic models are valuable in the overall study of
computer  system  performance.  Even  though  they  are 
usually  simplified  abstractions  of  actual  systems  and 
constitute  only  one  dimension  of  the  total  space  of 
available  techniques,  they  do  have  advantages  in  cer¬ 
tain  areas.  To  capitalize  on  these  advantages,  systems 
analysts  should  be  exposed  to  both  the  power  and 
limitations  of  current  analytic  techniques,  and  re¬ 
searchers  in  the  area  should  strive  to  communicate 
their  results  in  more  usable  ways. 

Lessons from Perception for Chess-Playing
Programs (and Vice Versa)

Herbert  A.  Simon 

Introduction 

For  nearly  twenty  years,  artificial  intelligence  and 
cognitive  psychology  have  maintained  a  close  sym¬ 
biotic  relationship  to  each  other.  It  has  often  been 
remarked  that  their  cooperation  stems  from  no  logi¬ 
cal  necessity.  That  a  human  being  and  a  computer  are 
both  able  to  perform  a  certain  task  implies  nothing 
for  the  identity,  or  even  similarity,  of  their  respective 
performance  processes.  Each  may  have  capabilities 
not  shared  by  the  other,  and  may  build  its  perform¬ 
ances  on  those  peculiar  capabilities  rather  than  upon 
those  they  hold  in  common. 

In  spite  of  this  logical  possibility  of  total  irrele¬ 
vance  of  the  one  field  for  the  other,  during  the  last 
two  decades  there  has  been  massive  borrowing  in 
both directions. Artificial intelligence programs capable
of humanoid performance in particular task domains
have provided valuable hypotheses about the
processes  that  humans  might  use  to  perform  these 
same tasks, and some of these hypotheses have subsequently
been supported by evidence. Bobrow's [2]
STUDENT  program,  for  example,  which  translated 
story  problems  into  algebraic  equations,  provided  a 
model, later tested by Paige & Simon [11], for some
of the human syntactic processes in performing that
task.

Conversely, hypotheses and data about human performance
have been important inputs to artificial
intelligence efforts. The General Problem Solver, for
example, received its early shape from analyses of
human thinking-aloud protocols in a problem solving
task [8].

The distance between AI and cognitive psychology
has not been the same in all task domains. Until quite
recently, for instance, AI research on theorem proving
developed in directions quite different from those
suggested  by  the  study  of  human  behavior  in  theorem 
proving  tasks.  There  is  little  that  is  humanoid  about 
resolution  theorem  proving. 

In the domain of chess playing, the distance between
AI and cognitive psychology has been neither
so  close  as  in  the  GPS  example,  nor  so  distant  as  in 
theorem  proving.  The  early  chess  playing  programs,  in 
their  reliance  on  brute  force  and  machine  speed, 
borrowed  little  from  what  was  known  of  human  chess 
playing processes [9]. The clear demonstration, by
their relatively weak levels of performance, that speed
was  not  enough,  produced  a  gradual  movement 
toward  incorporating  into  the  programs  some  of  the 
selective  task-dependent  heuristics  that  humans  rely 
heavily  upon  in  their  chess  playing.  However,  the 
strongest  chess  programs  in  existence  today  still  rely 
heavily  upon  extensive  rapid  search,  usually  over 
thousands  or  tens  of  thousands  of  branches  of  the 
game tree [7].

I  should  like  to  describe  here  some  efforts  on  the 
other  side  of  the  line  -  attempts  to  explore  chess 
playing mechanisms that can explain human chess
performance.  These  mechanisms  may  turn  out  to 
have  important  implications  for  the  future  of  chess 
playing programs motivated by AI goals. Their own
motivation,  however,  was  largely  psychological. 

MATER 

The  story  begins  with  an  examination  of  those 
kinds  of  chess  positions  where  appropriate  search  will 
disclose  a  checkmating  combination  against  which  the 
opponent  has  no  defense.  We  have  good  evidence  that 
strong  human  players  discover  these  checkmates  in 
over-the-board  play  after  exploring  trees  of  positions 
having  (generally)  only  a  few  dozen  branches.  Simon 
& Simon [15] hand-simulated a program that
achieved this kind of performance, and which discovered
checkmates as deep as eight moves (16 plies).
This  program  was  further  developed  and  implemented 
by Baylor & Simon [1] in several versions of the
MATER  program. 


MATER  relied,  first  of  all,  on  being  able  to  detect 
attack  and  defense  relations  among  pairs  of  pieces  on 
the  board,  and  to  use  this  information  to  guide  its 
search.  On  the  offensive  side  (in  its  simplest  version), 
it  examined  only  checking  moves  -  that  is,  moves 
attacking  the  king;  but  on  the  defensive  side,  it  ex¬ 
amined  all  legal  replies.  (This  is  essential  in  order  to 
demonstrate  that  the  checkmate  cannot  be  escaped.) 
MATER's  second  important  heuristic  was  to  employ 
a  search-and-scan  strategy  —  at  each  stage  it  explored 
first  that  branch  on  the  as-yet  unexplored  portion  of 
its  game  tree  which  allowed  the  opponent  the  fewest 
replies. The combination of its selectivity in considering
attacking moves, and its priority ordering for
attention to restricting moves, gave it great power
with modest amounts of search. In one of its most
impressive performances - rediscovering the eight-move
mate from a game of Edward Lasker against
Thomas - the search tree grew to only 108 positions,
and in most positions it was much smaller.
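
The two heuristics just described can be sketched on an abstract game tree. The following is only an illustration under stated assumptions, not the MATER program: the `Position` structure, the move names, and the toy tree are hypothetical stand-ins for a real move generator.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Position:
    """Attacker to move. Maps each checking move to the positions that
    arise after each legal reply; an empty list means checkmate.
    (Hypothetical structure, used here instead of a chess move generator.)"""
    checks: Dict[str, List["Position"]] = field(default_factory=dict)


def forces_mate(pos: Position, depth: int, stats: Dict[str, int]) -> bool:
    """True if the attacker can force mate within `depth` checking moves."""
    stats["positions"] += 1
    if depth == 0:
        return False
    # Second heuristic: try first the check that leaves the fewest replies.
    ordered = sorted(pos.checks.items(), key=lambda item: len(item[1]))
    for _move, replies in ordered:
        # First heuristic: only checks are tried for the attacker,
        # but every legal reply of the defender must be refuted.
        if all(forces_mate(reply, depth - 1, stats) for reply in replies):
            return True
    return False


# Toy tree: one check allows three replies, the other forces a single reply.
root = Position(checks={
    "Qh5+": [Position(), Position(), Position()],
    "Rg8+": [Position(checks={"Qg7#": []})],
})
stats = {"positions": 0}
found = forces_mate(root, depth=2, stats=stats)
```

In this toy case the ordering by fewest replies leads the search straight down the forcing line, so the less restrictive check is never examined.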


The  claim  that  selective  search  could  account  for 
many  aspects  of  human  performance  in  chess  was 
challenged  by  a  number  of  psychologists  who  thought 
that  perceptual  processes,  enabling  a  master  player  to 
see  "at  once"  a  whole  multitude  of  meaningful  rela¬ 
tions  in  a  position  placed  before  him,  held  the  key  to 
skilled human chess playing. The Russian investigators,
Tichomirov and Poznyanskaya [16], for example,
recorded eye movements of a strong player for
the  first  five  seconds  after  he  was  shown  a  chess 
position  with  instructions  to  find  the  best  move. 
During  these  five  seconds,  there  were  about  twenty 
eye  fixations,  and  almost  all  of  these  fixations  were 
aimed  at  "important"  squares  of  the  board  —  those 
that  a  skilled  player  would  regard  as  important  for 
the  position.  The  edges  and  corners  of  the  board 
received  almost  no  direct  attention.  Moreover,  the 
sequence  of  fixations  could  not  be  correlated  with 
any  possible  tree  of  moves.  Saccadic  movements  of 
the  eyes  from  one  fixation  to  the  next  generally 
passed  along  lines  of  potential  action  between  pairs  of 
pieces.  Thus,  the  eyes  might  move  from  one  piece  to 
another  that  attacked  or  defended  it,  or  was  attacked 
or  defended  by  it. 


To  interpret  the  results  of  Tichomirov  and 
Poznyanskaya,  we  need  a  few  facts  about  the  nature 
of  vision.  The  eye  has  a  central  area,  or  fovea,  about 
1°  in  radius,  of  very  high  resolution,  surrounded  by  a 
much  wider  peripheral  area  (about  7°)  in  which  famil¬ 
iar  objects  can  usually  be  recognized,  but  no  detailed 
information  about  them  can  be  acquired.  Since  the 
angle  between  successive  fixations  is  usually  several 
degrees,  the  information  that  directs  the  saccadic 
movements  must  be  acquired  peripherally. 

Simon & Barenfeld [12] set out to demonstrate
that  a  serial  processor  could  simulate  the  observed 
eye  movement  phenomena  without  requiring  the  as¬ 
sumption  that  large  amounts  of  information  can  be 
acquired  instantaneously  and  in  parallel  over  the 
whole visual field. Their simulation program, PERCEIVER,
used a stripped-down version of MATER
(removing the executive routine that guided its search
for mating combinations) to detect attack and defense
relations between pairs of pieces. These relations,
once detected, drove the eye movements.

More  specifically,  PERCEIVER  assumed  the  eye 
to  be  fixated,  initially,  on  some  prominent  piece  in 
the  position.  The  attack  and  defense  relations  be¬ 
tween  that  piece  and  other  pieces  would  be  detected 
(presumably by a combination of foveal and peripheral
vision), and the eye would then move to a new
fixation  at  one  of  the  squares  so  related  to  the  point 
of  previous  fixation.  Successive  saccadic  movements 
would  carry  the  eyes  around  the  board,  but  would 
tend  to  move  them  most  often  to  those  parts  of  the 
board  where  the  network  of  chess  relations  among 
pieces  was  densest.  Hence,  PERCEIVER  had  many 
fixations  on  the  "important"  squares,  and  seldom 
strayed  out  to  the  corners  of  the  board.  In  fact,  its 
fixations  and  their  sequence  were  indistinguishable 
from  the  human  eye  movements. 
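
The scanning scheme described above can be sketched as a walk over a graph of attack and defense relations. This is a rough illustration and not the PERCEIVER program: the relation pairs are invented, and the rule for choosing among related squares (a seeded random choice) is an assumption, since the text does not specify one.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_relations(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Symmetric relation map: a square is related to every square holding
    a piece that attacks or defends it, or is attacked or defended by it."""
    related: Dict[str, Set[str]] = defaultdict(set)
    for a, b in pairs:
        related[a].add(b)
        related[b].add(a)
    return related


def scan(related: Dict[str, Set[str]], start: str,
         n_fixations: int, seed: int = 0) -> List[str]:
    """Sequence of fixations; each saccade follows one relation."""
    rng = random.Random(seed)
    path = [start]
    while len(path) < n_fixations:
        neighbours = sorted(related.get(path[-1], ()))
        if not neighbours:  # nothing related to the current square
            break
        path.append(rng.choice(neighbours))  # assumed selection rule
    return path


# Invented relations among occupied squares; corner a1 takes part in none.
pairs = [("e4", "d5"), ("d5", "d8"), ("d5", "f6"),
         ("e4", "c3"), ("f6", "g7"), ("d8", "f6")]
related = build_relations(pairs)
path = scan(related, start="e4", n_fixations=20)
fixation_counts = Counter(path)
```

Squares that enter into no relation are never fixated, and squares with many relations tend to be revisited, which is the qualitative pattern reported for the human records.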

PERCEIVER  showed  that  the  basic  perceptual 
processes  required  for  the  initial  reconnaissance  of  a 
chess  position  were  just  like  those  that  had  already 
been  incorporated  in  MATER  for  the  search  of  the 
tree  of  moves.  The  amount  of  visual  information  to 
be  acquired  during  the  initial  "perceptual"  phase  was 
not  more  than  could  be  accounted  for  by  this  kind  of 
scanning  process.  There  was  no  evidence  that  the 
Gestalt  of  the  position  was  seized  "instantaneously". 


PERCEIVER 


Reconstructing  Chess  Positions 

Another chess perception phenomenon, first discovered
in 1925 in Moscow [5], studied in detail by
de Groot [4] in Amsterdam in the 1930's, and replicated
again in our laboratory within the past couple
of years, raised a different set of questions about how
the mechanisms incorporated in MATER and PERCEIVER
could account for the perceptual abilities of
skilled  chess  players.  This  phenomenon  was  the  re¬ 
markable  ability  of  chess  masters  and  grandmasters  to 
reproduce  a  position  from  an  actual  game  (not  pre¬ 
viously  known  to  them)  after  they  had  seen  it  for 
only  five  or  ten  seconds. 

In  brief,  the  empirical  findings  are  these:  take  a 
position (typically, with about 25 pieces on the
board) from a game between strong players. Allow a
master  to  examine  it  for  five  seconds.  He  will  then  be 
able,  with  about  80%  accuracy,  to  replace  the  pieces 
correctly  on  the  board.  Let  a  weak  player  examine 
the  same  position  for  five  seconds,  and  then  try  to 
reconstruct  it.  He  will  be  able  to  place  only  six  or 
seven  pieces  correctly  on  the  board:  about  25%. 

But an equally surprising result is obtained if we
now  perform  the  same  experiment  with  a  board  on 
which  the  pieces  have  been  placed  at  random.  Now 
the  performance  of  the  master  falls  to  the  level  of  the 
amateur,  while  the  latter  does  slightly  less  well  than 
before.  That  is  to  say,  both  master  and  amateur  will 
now  recall  the  positions  of  only  about  one  quarter  of 
the  pieces,  and  the  master  will  do  no  better  than  the 
weak  player. 

The  first  part  of  the  experiment  might  seem  to 
suggest  that  the  chess  master  has  unusual  powers  of 
visual  imagery  -  a  hypothesis  about  chess  players 
that  has  been  widely  believed.  But  the  second  part  of 
the  experiment  shows  that  these  visual  powers  evapo¬ 
rate  when  the  situations  are  different  from  those 
encountered  in  actual  chess  play.  Evidently,  the  chess 
master's  superior  perceptive  powers  rest  on  special 
chess  knowledge,  and  not  on  any  unusual  properties 
of  his  visual  or  imaging  system. 

This  experiment  seems  at  first  to  conflict  with 
what we know about short-term memory [10]. There
is  a  large  body  of  evidence  to  show  that  one  can  hold 
only  about  a  half  dozen  "chunks"  of  information  in 
short-term  memory.  The  information  —  up  to  about 
that  amount  —  can  be  kept  there  indefinitely,  but 
transferring it to long-term memory (to free up the
short-term  memory  for  other  information)  requires 
about  five  or  more  seconds  for  each  chunk. 


The  term  “chunk”  in  this  theory  is  not  quite  as 
vague  as  might  appear.  A  "chunk"  is  any  unit  of 
information  that  is  already  familiar  to  the  subject, 
and which he can therefore recognize as an old friend.
Thus,  for  a  native  speaker  of  a  language,  any  common 
word  is  (at  most)  a  single  chunk,  and  even  common 
idiomatic  phrases  (e.g.  "make  or  break")  may  be 
chunks.  Hence  it  is  often  possible  to  estimate  in 
advance  the  number  of  chunks  contained  in  a  given 
stimulus  —  a  string  of  words  or  numbers,  say. 

The  findings  in  the  chess  perception  experiments 
could  be  reconciled  with  the  hypothesis  of  limited 
short-term  memory  if  the  chess  master  could  recog¬ 
nize  a  chess  position  as  a  configuration  of  a  half- 
dozen  chunks  of  three  or  four  pieces  each,  while  the 
amateur  recognized  each  piece  as  a  separate  chunk. 
The master's chunks would be configurations familiar
to  him  from  having  seen  the  same  arrangements  of 
pieces  in  many  previous  positions. 

This  hypothesis  has  been  explored  by  Chase  & 
Simon [3] in a series of experiments in which they
videotaped  players  reconstructing  positions  and  timed 
the  intervals  between  successive  placements  of  pieces. 
Long  intervals  (over  two  seconds)  were  assumed  to 
represent  chunk  boundaries;  short  intervals  (less  than 
two  seconds)  were  assumed  to  be  within-chunk  inter¬ 
vals.  The  data  gave  support  to  several  aspects  of  the 
hypothesis:  the  chunks  so  defined  were  in  fact  clus¬ 
ters  of  pieces  of  kinds  that  occur  with  high  frequen¬ 
cies  in  games.  Several  kinds  of  evidence  reinforced  the 
plausibility  of  the  two-second  criterion  for  chunk 
boundaries. 
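
The two-second criterion amounts to a simple segmentation rule over the timed placements. A minimal sketch follows; the trial data are invented for illustration, and treating a pause of exactly two seconds as within-chunk is an assumption, since the text only distinguishes "over" from "less than" two seconds.

```python
from typing import List, Tuple


def split_into_chunks(placements: List[Tuple[str, float]],
                      boundary_s: float = 2.0) -> List[List[str]]:
    """Group placed pieces into chunks. Each placement is (piece, pause),
    where pause is the time in seconds since the previous placement.
    A pause longer than `boundary_s` starts a new chunk."""
    chunks: List[List[str]] = []
    for piece, pause in placements:
        if not chunks or pause > boundary_s:
            chunks.append([piece])
        else:
            chunks[-1].append(piece)
    return chunks


# Invented trial: piece placed, and pause before placing it (seconds).
trial = [("Kg1", 0.0), ("Rf1", 0.8), ("Pg2", 0.6), ("Ph2", 0.5),
         ("Qd8", 3.1), ("Nf6", 1.2), ("Pe4", 2.7)]
chunks = split_into_chunks(trial)
```

For this made-up trial the rule yields three chunks of four, two, and one pieces.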

The  master's  chunks  were,  in  fact,  larger  than 
those  of  the  weaker  players  -  perhaps  fifty  per  cent 
larger,  on  average.  To  that  extent  the  short-term 
memory hypothesis was supported. However, contrary
to the hypothesis, Chase & Simon found that
the  master  held  more  chunks  in  memory  (also  by  a 
margin  of  about  fifty  per  cent)  than  did  weaker 
players.  Hence  the  master  appeared  to  have  a  some¬ 
what  larger  short-term  memory  capacity,  measured  in 
chunks,  than  did  the  others.  This  discrepancy  be¬ 
tween  theory  and  data  remains  unexplained  at  pres¬ 
ent,  and  constitutes  one  of  the  important  targets  of 
our  continuing  research  on  this  subject. 

MAPP

If  we  take,  for  the  moment,  an  optimistic  position, 
and  assume  that  further  investigation  will  reconcile 
the  chunking  hypothesis  with  the  observed  data,  we 
still  have  to  discover  what  kind  of  organization  of 
processes  would  produce  these  phenomena.  In  the 
interest  of  parsimony,  we  don't  want  to  invent  ex¬ 
planations  ad  hoc  for  this  purpose,  but  wish  to  limit 
ourselves  to  processes  that  are  already  known  to  exist 
from other psychological experiments.

The MAPP program was written by Simon & Gilmartin
to simulate the phenomena of the position-reconstruction
experiment with the help of well
substantiated mechanisms. MAPP can be regarded as
the offspring of a mating between the PERCEIVER
program, used to simulate the eye movements, and
EPAM, a venerable simulation program first devised
by Feigenbaum to explain the main results from a
whole range of standard rote-learning experiments
[6].

Since  the  19th  century,  psychologists  have  been 
studying  the  processes  for  memorizing  syllables, 
either  in  the  form  of  paired  associates  (stimulus  = 
BYX,  response  =  GOV)  or  in  the  form  of  series  (CEV, 
DAR,  CUJ,  et  cetera).  Meaningfulness  and  familiarity 
of  items  have  been  shown  to  have  major  facilitative 
effects  on  learning  (as  much  as  a  three-to-one  increase 
in  learning  rate  for  meaningfulness);  similarity  of 
items,  a  deterrent  effect.  In  a  list,  the  items  at  the 
ends are generally learned with fewer errors than the
middle items (serial position curve). For materials of
a given kind, amount of learning is roughly proportional
to total time. These are illustrative of some of
the main findings from rote learning experiments.

The  EPAM  program  gives  correct  predictions  — 
and  in  many  cases  quantitatively  correct  predictions 
—  of  the  effects  of  these  and  other  learning  variables 
* 1 .  It  is  a  reasonably  well  verified  first  approxima¬ 
tion  to  a  theory  of  rote  learning.  The  MAPP  program 
combines  the  main  EPAM  mechanisms  with  the 
mechanisms  embodied  in  PERCEIVER  in  an  en¬ 
deavor  to  explain  the  chess  position-recognition  data. 
Since not all of the detail of the two parent programs
is relevant to these data, MAPP uses somewhat
stripped-down versions of EPAM and PERCEIVER.


MAPP  has  two  main  components:  (1)  a  learning 
program  and  (2)  a  performance  program.  The  learning 
program  is  exposed  to  many  configurations  of  chess 
pieces  (two  to  seven  pieces  each)  of  kinds  that  occur 
frequently  in  chess  games.  It  grows,  through  this 
exposure,  a  large  discrimination  net  that  allows  it  to 
recognize  these  configurations  when  it  encounters 
them  again,  and  which  stores  the  information  needed 
to reconstruct each of them. The net-growing
processes  are  essentially  the  processes  of  EPAM,  and 
the  configurations  that  become  recognizable  through 
this  learning  are  the  chunks  to  be  held  in  short-term 
memory. 

The  performance  program  of  MAPP  scans  a  chess 
position  that  is  presented  to  it,  looking  for  salient 
pieces. It fixates on each salient piece, and uses the
previously grown EPAM net to recognize the largest
possible configuration of pieces around it. If it succeeds
in recognizing a configuration, it stores in short-term
memory the address in the EPAM net where the
information  about  the  configuration  can  be  found. 
Up  to  six  (or  whatever  number  is  specified  by  the 
parameter)  such  chunks  can  be  stored  simultaneously 
in  short-term  memory. 

After  short-term  memory  has  been  filled  -  or  all 
salient  pieces  have  been  scanned,  whichever  occurs 
first  -  information  about  the  board  is  removed,  and 
MAPP is instructed to reconstruct the position. It
takes the chunk addresses stored in short-term memory,
recovers from the EPAM net the configurations
corresponding to each of these chunks, and reconstructs
the position (or as much of it as it has stored
in memory) on the board.
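
The performance cycle just described (scan, recognize, store addresses, reconstruct) can be sketched as follows. This is a simplified illustration, not the MAPP program: a plain lookup table stands in for the EPAM discrimination net, the piece encoding and the sample patterns are invented, and "recognizing the largest configuration around a piece" is approximated by picking the largest stored pattern that contains the piece and is fully present on the board.

```python
from typing import Dict, FrozenSet, List, Set

Piece = str  # piece letter plus square, e.g. "Kg1" (illustrative encoding)
Net = Dict[int, FrozenSet[Piece]]  # address -> stored configuration


def learn(configurations: List[Set[Piece]]) -> Net:
    """Stand-in for growing the net: assign each pattern an address."""
    return {addr: frozenset(cfg) for addr, cfg in enumerate(configurations)}


def perceive(board: Set[Piece], salient: List[Piece],
             net: Net, capacity: int = 6) -> List[int]:
    """Fixate each salient piece and store in short-term memory the
    address of the largest known configuration found around it."""
    stm: List[int] = []
    for piece in salient:
        if len(stm) >= capacity:  # short-term memory is full
            break
        matches = [addr for addr, cfg in net.items()
                   if piece in cfg and cfg <= board]
        if matches:
            best = max(matches, key=lambda addr: len(net[addr]))
            if best not in stm:
                stm.append(best)
    return stm


def reconstruct(stm: List[int], net: Net) -> Set[Piece]:
    """Recover the stored configurations from their addresses."""
    recalled: Set[Piece] = set()
    for addr in stm:
        recalled |= net[addr]
    return recalled


net = learn([
    {"Kg1", "Rf1", "Pf2", "Pg2", "Ph2"},  # castled king with pawn shield
    {"Kg1", "Rf1"},
    {"Kg8", "Pg7", "Ph7"},
    {"Kg8", "Pf7", "Pg7", "Ph7"},          # not fully present on the board below
])
board = {"Kg1", "Rf1", "Pf2", "Pg2", "Ph2", "Qd1",
         "Kg8", "Pg7", "Ph7", "Nc6"}
stm = perceive(board, salient=["Kg1", "Kg8", "Qd1"], net=net)
recalled = reconstruct(stm, net)
accuracy = len(recalled & board) / len(board)
```

Only addresses are held in short-term memory, so the number of pieces recalled depends on how large the recognized configurations are, not on the number of memory slots alone.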

How  successful  is  MAPP  in  accounting  for  the 
superior  ability  of  chess  masters  to  reconstruct  posi¬ 
tions?  The  largest  EPAM  net  that  MAPP  has  grown 
thus far contains 1,144 configurations, of two to
seven  pieces  each,  selected  more  or  less  unsystemat¬ 
ically  from  diagrams  in  standard  chess  works.  We 
cannot  be  sure  that  these  are  the  configurations  that 
occur  most  frequently  in  chess  games,  but  they  cer¬ 
tainly  include  a  large  fraction  of  the  configurations  of 
high  frequency.  Using  this  EPAM  net,  MAPP  was  able 
to replace 55% of the pieces in nine positions. In
experiments with the same nine positions, a master
replaced  81%  of  the  pieces,  while  a  Class  A  player 
replaced  49%. 


Thus,  given  familiarity  with  1,144  common  con¬ 
figurations  of  pieces,  MAPP  performs  twice  as  well  as 
a  beginner,  a  little  better  than  a  Class  A  player,  and 
not  nearly  so  well  as  a  master.  We  can  now  ask  how 
much  the  EPAM  net  would  have  to  be  expanded  to 
bring  the  performance  of  MAPP  up  to  master  level. 
Since  the  net  already  contains  the  configurations  that 
occur  most  frequently,  each  new  configuration  we 
add  will  be  somewhat  more  rare  than  those  already  in 
the  net  -  hence  will  make  a  less  than  proportional 
contribution  to  performance.  We  cannot  estimate 
what  that  contribution  will  be  without  making  some 
assumption  about  the  frequency  distribution  of  pat¬ 
terns.  It  is  probably  not  unreasonable  to  assume  that 
this  distribution  is  much  like  the  frequency  distribu¬ 
tion  of  words  in  natural  language.  The  latter  distribu¬ 
tion  is  highly  skewed,  and  is  closely  approximated  by 
the  so-called  harmonic,  or  Zipf,  distribution.  In  the 
harmonic  distribution,  when  words  are  arranged  by 
the frequency of their occurrence, the $k$th most frequent
word occurs about $1/k$ times as often as the
most frequent word: $f_k = (1/k) f_1$. (Interestingly
enough,  when  authors  are  ranked  by  the  numbers  of 
their  publications,  or  cities  by  their  populations,  the 
distributions  also  conform  approximately  to  the 
harmonic  law.) 

If  we  assume  that  the  frequency  distribution  of 
patterns  of  chess  pieces  is  also  a  harmonic  distribu¬ 
tion,  then  we  can  estimate  the  size  of  the  EPAM  net 
required  to  match  the  master  performance.  Taking 
the continuous approximation to $f_1/i$, the cumulative
distribution is the log function: $F_i = k \log_e i$. From the
MAPP simulation data, $0.55 = k \log_e 1144$. Solving this
equation, we find $k = 0.078$. Using this value of $k$, we
now calculate the size of the net for a performance
level of 0.81 by $\log_e N = 0.81/0.078$, whence $N \approx 32{,}000$.
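
The arithmetic of this estimate is easy to check. The sketch below simply repeats the calculation in the text under the same harmonic-distribution assumption; the function names are illustrative, and `k` here is the fitted constant of the cumulative distribution, not a rank.

```python
import math


def fit_k(performance: float, net_size: int) -> float:
    """Solve performance = k * ln(net_size) for the constant k."""
    return performance / math.log(net_size)


def required_net_size(target: float, k: float) -> float:
    """Solve target = k * ln(N) for the net size N."""
    return math.exp(target / k)


k = fit_k(0.55, 1144)                 # MAPP: 55% with 1,144 configurations
n_master = required_net_size(0.81, k)  # master: 81%
```

With unrounded intermediate values, `k` comes out close to 0.078 and `n_master` lands near 32,000, in agreement with the figures quoted above.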

How reasonable is it to assume that a chess master
is  familiar  with  32,000  configurations  of  chess  pieces? 
First,  there  are  a  number  of  other  indirect  ways  for 
estimating  the  size  of  the  net,  all  of  which  yield 
estimates  of  the  same  order  of  magnitude.  Further, 
the  estimate  computed  above  is  of  about  the  same 
size  as  the  natural  language  vocabulary  of  a  college- 
educated  adult.  Such  a  person  might  be  expected  to 
have  a  recognition  vocabulary  in  his  native  language 
of  25,000  to  100,000  words.  When  we  consider  that 
no  one  becomes  a  chess  master  without  some  years  of 
intensive  application  to  the  game  (grandmaster  status 
is  never  achieved  in  less  than  a  decade),  the  estimate 
becomes  quite  plausible;  for,  a  chess  master  has  spent 
about  as  many  hours  staring  at  chess  positions  as 
other  educated  adults  have  spent  staring  at  the 
printed  page. 


There  are  other  tests  of  MAPP  besides  the  relation 
between its vocabulary of chess patterns and its quantitative
performance on the recognition task. We
can  compare  the  nature  of  the  chunks  it  recognizes 
with  those  recognized  by  human  players  in  the  same 
positions. The agreement is generally good. Hence,
MAPP  must  be  taken  seriously  as  an  explanation  of 
the  phenomena,  and  it  would  be  desirable,  as  soon  as 
possible,  to  test  it  with  an  EPAM  net  grown  to 
25,000  or  50,000  configurations.  Since  the  smallest 
net  grown  for  the  experiment  occupied  about 
100,000  words  of  PDP-10  memory,  and  since  the 
time  required  to  grow  the  net  was  more  than  an  hour, 
the  experiment  will  probably  not  be  attempted  until 
memories become somewhat larger, faster, and
cheaper. 

Prospects 

To  understand  the  implications  of  the  research  on 
chess perception for the design of chess-playing programs,
one other phenomenon should be discussed. It
is  well  known  that  when  strong  chess  players  engage 
in  rapid-transit  games,  taking  only  a  few  seconds  for 
each  move,  their  play  is  weaker,  but  only  moderately 
weaker,  than  when  they  take  a  longer  time  for  their 
moves. Masters and grandmasters can play dozens (or
even  hundreds)  of  simultaneous  games  against  strong 
amateurs,  and  win  almost  all  of  them. 

Drawing  upon  what  has  been  learned  about  chess 
perception,  we  can  provide  a  plausible,  though  as  yet 
untested,  explanation  for  such  feats.  Consider  a  pro¬ 
duction  system  programmed  to  play  chess.  The  condi¬ 
tion  part  of  each  production  is  a  configuration  of 
pieces  on  the  board  —  just  such  a  configuration  as  is 
stored  in  the  EPAM  net.  The  action  part  of  the 
production  is  a  move  that  is  to  be  considered  when¬ 
ever  that  configuration  occurs.  The  productions  are 
arranged in priority order, with the most important at
the head of the list. Thus, an attack on a queen will
be noticed before an isolated pawn. The program
then  takes  the  first  action  whose  condition  is  satis¬ 
fied. 
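
The control structure of such a production system is very small. The sketch below is an illustration under simplifying assumptions, not a chess program: the board is reduced to a set of already-recognized pattern labels, and the two productions and their move descriptions are hypothetical.

```python
from typing import Callable, List, Optional, Set, Tuple

Board = Set[str]  # labels of configurations recognized on the board (assumed)
Production = Tuple[str, Callable[[Board], bool], str]  # name, condition, move


def first_applicable(productions: List[Production],
                     board: Board) -> Optional[str]:
    """Return the move of the first production whose condition holds.
    The list is assumed to be in priority order, most important first."""
    for _name, condition, move in productions:
        if condition(board):
            return move
    return None


productions: List[Production] = [
    ("queen attacked", lambda b: "queen_attacked" in b,
     "move the queen to safety"),
    ("isolated pawn", lambda b: "isolated_pawn" in b,
     "attack the isolated pawn"),
]

move = first_applicable(productions, {"isolated_pawn", "queen_attacked"})
```

Because the list is scanned in priority order, the attack on the queen is acted on even though the isolated pawn is also present. No search is involved, which is why such a system would play rapidly whatever the quality of its moves.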


Such  a  program  will  undoubtedly  not  play  good 
chess.  It  will  certainly  play  rapid  chess.  What  would 
have  to  be  added  to  it  to  permit  it  to  play  plausible 
chess  must  be  determined  by  experiment.  Notice  that 
for  a  "fair"  test,  a  very  large  number  of  productions 
—  tens  of  thousands  —  would  have  to  be  provided. 
But  the  real  point  at  issue  is  not  whether  a  program 
that  is  "nothing  but"  such  a  production  system  can 
be  a  strong  chess  player.  Rather,  the  point  at  issue  is 
whether  any  program  that  does  not  incorporate  a 
range  of  chess  knowledge  like  that  imbedded  in  the 
production  system  can  play  good  chess. 

The  experiments  I  have  described  bring  us  face-to- 
face  again  with  one  of  the  central  issues  of  artificial 
intelligence: to what extent can intelligence be made
general  and  independent  of  knowledge  about  particu¬ 
lar  subject-matter  fields?  To  the  extent  that  artificial 
intelligence  is  to  be  modelled  on  human  intelligence, 
these  experiments  suggest  that  general  mechanisms, 
however powerful and indispensable, are no complete
substitute  for  the  ability  to  recognize  a  very  large 
number  of  quite  specific  features  imbedded  in  com¬ 
plex  situations:  if  the  skilled  man  is  an  intelligent 
man,  he  is  also  a  learned  man. 